Abstract 

Anonymous social media platforms like Secret, Yik Yak, and Whisper have emerged as important tools for sharing ideas 
without the fear of judgment. Such anonymous platforms are also important in nations under authoritarian rule, where freedom of 
expression and the personal safety of message authors may depend on anonymity. Whether for fear of judgment or retribution, it 
is sometimes crucial to hide the identities of users who post sensitive messages. In this paper, we consider a global adversary who 
wishes to identify the author of a message; it observes either a snapshot of the spread of a message at a certain time, sampled 
timestamp metadata, or both. Recent advances in rumor source detection show that existing messaging protocols are vulnerable 
to such an adversary. We introduce a novel messaging protocol, which we call adaptive diffusion, and show that under 
the snapshot adversarial model, adaptive diffusion spreads content fast and achieves perfect obfuscation of the source when the 
underlying contact network is an infinite regular tree. That is, all users with the message are nearly equally likely to have been 
the origin of the message. When the contact network is an irregular tree, we characterize the probability of maximum likelihood 
detection by proving a concentration result over Galton-Watson trees. Experiments on a sampled Facebook network demonstrate 
that adaptive diffusion effectively hides the location of the source even when the graph is finite, irregular and has cycles. 

Index Terms 


privacy, diffusion, anonymous social media. 


I. Introduction 

Microblogging platforms are central to the fabric of the present Internet; popular examples include Twitter and Facebook. In 
such platforms, users propagate short messages (texts, images, videos) over a contact graph, which represents a social network 
in most cases. Message forwarding often occurs through built-in mechanisms that rely on user input, such as clicking “like” or 
“share” on a particular post. Brevity of message, fluidity of user interface, and trusted party communication combine to make 
these microblogging platforms a major communication mode of modern times. 

However, the popularity of microblogging services also makes them a prime target for invasive user monitoring by employers, 
service providers, or government agencies. This monitoring typically exploits metadata: non-content data that characterizes 
content, like timestamps. Metadata can often be as sensitive as data itself [?], [?]; this reality was publicized by Michael 
Hayden, former Director of the CIA, with his observation that “We kill people based on metadata” [?]. 

The alarming privacy implications of these platforms have spurred the growth of anonymous microblogging platforms, like 
Whisper [?], Yik Yak [?], and the now-defunct Secret [?]. These platforms enable users to share messages with their friends 
without revealing authorship metadata. 

Existing anonymous messaging services store both messages and authorship information on centralized servers, which makes 
them vulnerable to government subpoenas, hacking, or direct company access. An alternative solution would be to store this 
information in a distributed fashion; each node would know only its own friends, and message authorship information would 
never be transmitted to any party. Distributed systems are more robust to monitoring due to lack of central points of failure. 
However, even under distributed architectures, simple anonymous messaging protocols (such as those used by commercial 
anonymous microblogging apps) are still vulnerable to an adversary with side information, as shown by recent advances 
in rumor source detection [?], [?]. In this work, we study a basic building block of the messaging protocol that would underpin 
a truly anonymous microblogging platform: how to anonymously broadcast a single message on a contact network, even in the 
face of a strong deanonymizing adversary with access to metadata. Specifically, we focus on anonymous microblogging built 
atop an underlying social network, such as a network of phone contacts or Facebook friends. 

Adversaries. We consider three adversarial models, which capture different approaches to collecting metadata. In each case, 
the underlying contact network is modeled as a graph that is known to the adversary. Beyond that, the adversary could proceed 
in a few different ways. 

The adversary might use side channels to infer whether a node is infected, i.e., whether it received the message. If an 
adversary collects only infection metadata for all network users, we call it a snapshot adversary. This could represent a 
state-level adversary that attends a Twitter-organized protest; it implicitly learns who received the protest advertisement by 
observing which individuals are physically present, but not the associated metadata. The snapshot adversary is well-studied 
in the literature, primarily in the related problem of source identification [7], [9], [10], [11], [12]. We focus primarily on the 
snapshot adversarial model in this paper. 

Alternatively, the adversary might explicitly corrupt some fraction of nodes by bribery or coercion; these corrupted spy 
nodes could pass along metadata like message timestamps and relay IDs. If an adversary only collects information from spies, 
we call it a spy-based adversary. A spy-based adversary could represent a government agency participating in social media 
to study users, for instance. The adversary’s reach may be limited by factors like account creation, contact network structure 
[?], or the cost of corrupting participants. This adversarial model is discussed in detail in [14], but we include the relevant 
theoretical results in this paper for the sake of completeness. 

Finally, an adversary could combine the spy-based and snapshot adversarial models by using both forms of metadata. If 
an adversary uses spies and a snapshot, we call it a spy+snapshot adversary. This adversarial model allows us to study the 
capabilities of both snapshot and spy metadata types, combining the results on snapshot adversary capabilities derived here 
with those of spy adversary capabilities derived in [14]. 

Spreading models. In social networks, messages are typically propagated based on users’ approval, which is expressed via 
liking, sharing or retweeting. This mechanism, which enables social filtering and reduces spam, has inherent random delays 
associated with each user’s time of impression and decision to “like” the message (or not). Standard models of rumor spreading 
in networks explicitly model such random delays via a diffusion process: messages are spread independently over different 
edges with a fixed probability of spreading (discrete time model) or an exponential spreading time (continuous time model). 
The designer can partially control the spreading rate by introducing artificial delays on top of the usual random delays due to 
users’ approval of the messages. 

We model this physical setup as a discrete-time system. At time $t = 0$, a single user $v^* \in V$ starts to spread a message over 
the contact network $G = (V, E)$, where users and contacts are represented by nodes and edges, respectively. Upon receiving 
the message, nodes approve it immediately. The assumption that all nodes are willing to approve and pass the message is 
common in rumor spreading analysis [7], [?], [?]. However, by assuming message approval is immediate, we abstract away 
the natural random delays typically modeled by diffusion. At the following timestep, the protocol decides which neighbors will 
receive the message, and how much propagation delay to introduce. Given this control, the system designer wishes to design 
a spreading protocol that makes message source inference difficult. 

Specifically, let $V_T \subseteq V$, $G_T$, and $N_T = |V_T|$ denote the set of infected nodes, the subgraph of G containing 
only $V_T$, and the number of infected nodes, respectively, at a given time T. The adversary uses all available metadata to estimate 
the source. We assume no prior knowledge of the source, so the adversary computes a maximum-likelihood (ML) estimate of 
the source, $\hat{v}_{ML}$. We desire a spreading protocol that minimizes the probability of detection $P_D = \mathbb{P}(\hat{v}_{ML} = v^*)$. 

Current state-of-the-art: Diffusion is commonly used to model epidemic propagation over contact networks. While simplistic 
(it ignores factors like individual user preferences), diffusion is a commonly-studied and useful model due to its simplicity and 
first-order approximation of actual propagation dynamics. Critically, it captures the symmetric spreading used by most social 
media platforms. 

However, diffusion has been shown to exhibit poor anonymity properties; under the adversarial models we consider, the 
source can be identified reliably [7], [8]. We therefore seek a different spreading model with strong anonymity guarantees. We 
wish to achieve the following performance metrics: 

(a) We say a protocol has an order-optimal rate of spread if the expected time for the message to reach n nodes scales 
linearly compared to the time required by the fastest spreading protocol. 

(b) We say a protocol achieves a perfect obfuscation if the probability of source detection for the maximum likelihood 
estimator is order-optimal. The definition of optimality differs for different adversarial models, so we define this metric 
separately for each adversarial model. 

Contributions. We introduce adaptive diffusion, a novel messaging protocol with provable author anonymity guarantees against 
all of the discussed adversarial models. Whereas diffusion spreads the message symmetrically in all directions, adaptive diffusion 
breaks that symmetry (Figure 1). This has different implications for different adversarial models, but it consistently yields 
stronger anonymity guarantees than diffusion. Adaptive diffusion is also inherently distributed and spreads messages fast, i.e., 
the time it takes adaptive diffusion to reach n users is at most twice the time it takes the fastest spreading scheme, which 
immediately passes the message to all its neighbors. 

We prove that over d-regular trees, adaptive diffusion provides perfect obfuscation of the source under the snapshot adversarial 
model. That is, the likelihood of an infected user being the source of the infection is equal among all infected users. We derive 
exact expressions for the probability of detection, and show that this expression is optimal for the snapshot adversary by 
providing a matching fundamental lower bound. 

In practice, the contact networks are not regular infinite trees. For a general class of graphs which can be finite, irregular 
and have cycles, we provide results of numerical experiments on real-world social networks and synthetic networks showing 
that the protocol hides the source at nearly the best possible level of obfuscation under the snapshot adversarial model. The 
same is true for spy-based adversaries; such simulation results for that adversarial model are discussed in [?]. Further, for a 
specific family of random irregular infinite trees, known as Galton-Watson trees, we characterize the probability of detection 
under adaptive diffusion and a snapshot adversary. In the process, we prove a strong concentration for the extreme paths in 
the Galton-Watson tree that consists of nodes with large degrees, which might be of independent interest. 

Fig. 1: Illustration of a spread of infection when spreading immediately (left) and under adaptive diffusion (right). 

Finally, we characterize the probability of maximum likelihood detection of adaptive diffusion in various edge cases, such 
as when the adversary can take multiple snapshots, and when the underlying graph contains regular cycles, as in an infinite 
lattice graph. 


Related work. Anonymous communication has been a popular research topic for decades. For instance, anonymous point-to- 
point communication allows a sender to communicate with a receiver without the receiver learning the sender’s identity. A 
great deal of work has emerged in this area, including Tor [?], Freenet [?], Free Haven [?], and Tarzan [20]. In contrast to 
this body of work, we address the problem of anonymously broadcasting a message over an underlying contact network (e.g., 
a network of Facebook friendships or phone contacts). 

Anonymous broadcast communication has been most studied in the context of the dining cryptographers’ (DC) problem. We 
diverge from the literature on this topic [?], [?], [?], [?], [?] in approach and formulation. We consider statistical 
spreading models rather than cryptographic encodings, accommodate computationally unbounded adversaries, and consider 
domain-specific contact networks rather than a fully connected communication network. 

Recently, Riposte addressed a similar problem of anonymously writing to a public message board [?]. It uses techniques from 
private information retrieval to store multiple, corrupted copies of messages on distributed servers. This corruption is designed 
so that no subset of colluding servers can determine the author. However, Riposte places no restrictions on communication 
with the servers, thereby facilitating spam. Differences in the communication model and adversarial model prevent Riposte 
from effectively solving our problem of interest. 

Within the realm of statistical message spreading models, the problem of detecting the origin of an epidemic or the source 
of a rumor has been studied under the diffusion model. Recent advances in [7], [8], [9], [10], [11], [12], [?] 
show that it is possible to identify the source within a few hops with high probability. Consider an adversary 
who has access to the underlying contact network of friendship links and the snapshot of infected nodes at a certain time. The 
problem of locating a rumor source, first posed in [?], naturally corresponds to graph-centrality-based inference algorithms; 
for a continuous time model, [?], [?] used the rumor centrality measure to correctly identify the source after time T (with 
probability converging to a positive number for large d-regular and random trees, and with probability proportional to $1/\sqrt{T}$ 
for lines). The probability of identifying the source increases even further when multiple infections from the same source are 
observed [?]. With multiple sources of infections, spectral methods have been proposed for estimating the number of sources 
and the set of source nodes in [?], [?]. When infected nodes are allowed to recover as in the susceptible-infected-recovered 
(SIR) model, Jordan centrality was proposed in [?], [?] to estimate the source. In [?], it is shown that the Jordan center is 
still within a bounded hop distance from the true source with high probability, independent of the number of infected nodes. 

When the adversary collects timestamps (and other metadata) from spy nodes, standard diffusion reveals the location of the 
source [?], [?], [?]. Although ML estimation is known to be NP-hard [?], and analyzing the probability of detection is also 
challenging, the source can be effectively identified. 

In summary, under natural, diffusion-based message spreading (as seen in almost every content-sharing platform today), an 
adversary with some side information can identify the rumor source with high confidence. We overcome this vulnerability 
by asking the reverse question: can we design messaging protocols that spread fast while protecting the anonymity of the 
source? Related challenges include (a) identifying the best algorithm that the adversary might use to infer the location of 
the source; (b) providing analytical guarantees for the proposed spreading model; and (c) identifying the fundamental limit 
on what any spreading model can achieve. We address all of these challenges for the snapshot adversarial model (Section III), 
the spy-based adversarial model (Section IV), and finally the spy+snapshot model (Section V). In this paper, our primary focus 
is on the snapshot adversarial model; the spy-based and spy+snapshot adversaries are discussed in detail in [?]. 

Our work fits into a larger ecosystem that enables anonymous messaging; we implicitly assume that the ecosystem is healthy. 
For instance, we assume that nodes communicate securely in a distributed fashion, but anonymity-preserving, peer-to-peer (P2P) 
address lookup is still an active research area [?], as is privacy-preserving distributed data storage in P2P systems [?]. We do 
not consider adversaries that operate below the application layer (e.g., by monitoring the network or even physical layer) [?], 
[?]. Lower-level solutions may be more appropriate against such an opponent, harnessing factors like physical proximity of 
users [?]. In that space, physical layer security and privacy attacks pose a very real threat, as has been documented extensively 
in prior work [?], [?], [?]. 


Organization. The remainder of this paper is organized as follows: To begin, we introduce the general adaptive diffusion 
protocol in Section II. In Section III we describe how to specialize adaptive diffusion under a snapshot adversarial model. In 
Section IV we describe how to apply adaptive diffusion under a spy-based adversarial model. Combining the key insights of 
these two approaches, we introduce results from the spy+snapshot adversarial model in Section V. For each adversarial model, 
we first describe the precise version of adaptive diffusion that applies to infinite d-regular trees, and show that it achieves 
perfect obfuscation of the source. We then provide extensions to irregular trees. We conclude by presenting simulated results 
over real graphs: finite, irregular, and containing cycles. In Section VI we make a connection between adaptive diffusion on 
a line and Pólya’s urn processes. This connection, while interesting in itself, provides a novel analysis technique for precisely 
capturing the price of control packets that are passed along with the messages in order to coordinate the spread of messages 
as per adaptive diffusion. 


II. Adaptive diffusion 

In this section, we describe adaptive diffusion in its most general form, and leave for later sections the specific choice of 
parameters involved. For the purpose of introducing adaptive diffusion, we focus specifically on an infinite d-regular tree as the 
underlying contact network. 

We step through the intuition of the adaptive diffusion spreading model with an example, partially illustrated in Figure 2. 
The precise algorithm description is provided in Protocol 1. Adaptive diffusion ensures that the infected subgraph $G_t$ at any 
even timestep $t \in \{2, 4, \ldots\}$ is a balanced tree of depth $t/2$, i.e. the hop distance from any leaf to the root (or the center of the 
graph) is $t/2$. We call the root node of $G_t$ the “virtual source” at time t, and denote it by $v_t$. We use $v_0 = v^*$ to denote the 
true source. To keep the regular structure at even timesteps, we use the odd timesteps to transition from one regular subtree 
$G_t$ to another one $G_{t+2}$ with depth incremented by one. 

More concretely, the first three steps are always the same. At time t = 0, the rumor source $v^*$ selects, uniformly at random, 
one of its neighbors to be the virtual source $v_2$; at time t = 1, $v^*$ passes the message to $v_2$. Next, at t = 2, the new virtual 
source $v_2$ infects all its uninfected neighbors, forming $G_2$ (see Figure 2). Then node $v_2$ chooses to either keep the virtual source 
token or to pass it along. 

If $v_2$ chooses to remain the virtual source, i.e., $v_4 = v_2$, it passes ‘infection messages’ to all the leaf nodes in the infected 
subtree, telling each leaf to infect all its uninfected neighbors. Since the virtual source is not connected to the leaf nodes 
in the infected subtree, these infection messages get relayed by the interior nodes of the subtree. This leads to $N_t$ messages 
getting passed in total (we assume this happens instantaneously). These messages cause the rumor to spread symmetrically in 
all directions at t = 3. At t = 4, no spreading occurs (Figure 2, right panel). 

If $v_2$ does not choose to remain the virtual source, it passes the virtual source token to a randomly chosen neighbor $v_4$, 
excluding the previous virtual source (in this example, $v_0$). Thus, if the virtual source moves, it moves away from the true 
source by one hop. Once $v_4$ receives the virtual source token, it sends out infection messages. However, these messages do not 
get passed back in the direction of the previous virtual source. This causes the infection to spread asymmetrically over only 
one subtree of the infected graph ($G_3$ in Figure 2, left panel). In the subsequent timestep (t = 4), the virtual source remains 
fixed and passes the same infection messages again. After this second round of asymmetric spreading, the infected graph is 
once again symmetric about the virtual source $v_4$ ($G_4$ in Figure 2, left panel). 

This process continues at each timestep: the virtual source $v_t$ chooses whether to keep or pass the virtual source token. 
Conditioned on this decision, the infected subgraph grows deterministically as needed to ensure symmetry about the new virtual 
source, $v_{t+2}$. 

As we will see momentarily, adaptive diffusion uses varying amounts of control information to coordinate the spread of 
messages. In some adversarial models (snapshot), this control information does not hurt anonymity; in others (spy-based), it 
can be problematic. We therefore introduce different implementations of adaptive diffusion as needed, using different amounts 
of control information. In each implementation, the resulting distribution of the random infection process is the same (if the 
same parameters are used). 

This random infection process can be defined as a time-inhomogeneous (time-dependent) Markov chain over the state defined 
by the location of the current virtual source $\{v_t\}_{t \in \{0, 2, 4, \ldots\}}$. By the symmetry of the underlying contact network, which we 
assume is an infinite d-regular tree, and the fact that the next virtual source is chosen uniformly at random among the neighbors 
of the current virtual source, it is sufficient to consider a Markov chain over the hop distance between the true source $v^*$ and 
$v_t$, the virtual source at time t. Therefore, we design a Markov chain over the state 

\[ h_t = \delta_H(v^*, v_t) , \]

for even t, where $\delta_H(v^*, v_t)$ denotes the hop distance between nodes $v^*$ (the true source) and $v_t$ (the virtual source). Figure 
2 shows an example with $(h_2, h_4) = (1, 2)$ on the left and $(h_2, h_4) = (1, 1)$ on the right. 

Algorithm 1 Adaptive Diffusion 

Input: contact network $G = (V, E)$, source $v^*$, time $T$, degree $d$ 
Output: set of infected nodes $V_T$ 
  $V_0 \leftarrow \{v^*\}$, $h \leftarrow 0$, $v_0 \leftarrow v^*$ 
  $v^*$ selects one of its neighbors $u$ at random 
  $V_1 \leftarrow V_0 \cup \{u\}$, $h \leftarrow 1$, $v_1 \leftarrow u$ 
  let $N(u)$ represent $u$'s neighbors 
  $V_2 \leftarrow V_1 \cup N(u) \setminus \{v^*\}$, $v_2 \leftarrow v_1$ 
  $t \leftarrow 3$ 
  for $t \leq T$ do 
    $v_{t-1}$ selects a random variable $X \sim U(0, 1)$ 
    if $X \leq \alpha_d(t-1, h)$ then 
      for all $v \in N(v_{t-1})$ do 
        Infection Message($G$, $v_{t-1}$, $v$, $G_t$) 
    else 
      $v_{t-1}$ randomly selects $u \in N(v_{t-1}) \setminus \{v_{t-2}\}$ 
      $h \leftarrow h + 1$ 
      $v_t \leftarrow u$ 
      for all $v \in N(v_t) \setminus \{v_{t-1}\}$ do 
        Infection Message($G$, $v_t$, $v$, $G_t$) 
        if $t + 1 > T$ then 
          break 
        Infection Message($G$, $v_t$, $v$, $G_t$) 
    $t \leftarrow t + 2$ 

  procedure Infection Message($G$, $u$, $v$, $G_t$) 
    if $v \in V_t$ then 
      for all $w \in N(v) \setminus \{u\}$ do 
        Infection Message($G$, $v$, $w$, $G_t$) 
    else 
      $V_t \leftarrow V_{t-2} \cup \{v\}$ 

At every even timestep, the protocol randomly determines whether to keep the virtual source token ($h_{t+2} = h_t$) or to pass 
it ($h_{t+2} = h_t + 1$). We specify the resulting time-inhomogeneous Markov chain over $\{h_t\}_{t \in \{2, 4, 6, \ldots\}}$ by choosing appropriate 
transition probabilities as a function of time t and current state $h_t$. For even t, we denote this probability by 

\[ \alpha_d(t, h) = \mathbb{P}(h_{t+2} = h_t \mid h_t = h) , \tag{1} \]

where the subscript d denotes the degree of the underlying contact network. In Figure 2, at t = 2, the virtual source remains 
at the current node (right) with probability $\alpha_3(2, 1)$, or passes the virtual source to a neighbor with probability $1 - \alpha_3(2, 1)$ 
(left). The parameters $\alpha_d(t, h)$ fully describe the transition probability of the Markov chain defined over $h_t \in \{1, 2, \ldots, t/2\}$. 
For example, if we choose $\alpha_d(t, h) = 1$ for all t and h, then the virtual source never moves for t > 1. The message spreads 
almost symmetrically, so the source can be caught with high probability, much like diffusion. If we instead choose $\alpha_d(t, h) = 0$ 
for all t and h, the virtual source always moves. This ensures that the source is always at one of the leaves of the infected 
subgraph. We return to this special case when addressing spy-based adversaries in Section IV. 

Fig. 2: Adaptive diffusion over regular trees. Yellow nodes indicate the set of virtual sources (past and present), and for T = 4, 
the virtual source node is outlined in red. 
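
The two extreme parameter choices above can be reproduced with a small simulation. The following is only a sketch under stated assumptions, not the protocol implementation: it tracks even timesteps only and relies on the balanced-tree property (the snapshot at even time T is the ball of radius T/2 around the virtual source) instead of relaying individual infection messages. The names `LazyRegularTree` and `adaptive_diffusion_snapshot` are hypothetical, and the excluded neighbor on a token pass is interpreted as the neighbor lying on the path back to the true source, so that each pass increases the hop distance by one.

```python
import random
from collections import deque


class LazyRegularTree:
    """Infinite d-regular tree; nodes are integers created on demand."""

    def __init__(self, d):
        self.d = d
        self.adj = {0: []}  # node id -> list of neighbor ids

    def neighbors(self, v):
        # Top v up to exactly d neighbors the first time it is visited.
        while len(self.adj[v]) < self.d:
            u = len(self.adj)
            self.adj[u] = [v]
            self.adj[v].append(u)
        return list(self.adj[v])

    def ball(self, center, radius):
        """Hop distance from `center` for every node within `radius` hops."""
        dist = {center: 0}
        queue = deque([center])
        while queue:
            v = queue.popleft()
            if dist[v] == radius:
                continue
            for u in self.neighbors(v):
                if u not in dist:
                    dist[u] = dist[v] + 1
                    queue.append(u)
        return dist


def adaptive_diffusion_snapshot(d, T, alpha, rng):
    """Simulate up to even time T >= 2.

    `alpha(t, h)` is the keep probability. Returns the true source, the
    virtual source v_T, the hop distance h_T, and a dict mapping every
    infected node to its distance from v_T.
    """
    tree = LazyRegularTree(d)
    source = 0
    toward_source = source                    # neighbor of v_t on the path back to v*
    vs = rng.choice(tree.neighbors(source))   # v_2, chosen uniformly at random
    h = 1
    for t in range(2, T, 2):                  # transition from time t to t + 2
        if rng.random() < alpha(t, h):
            continue                          # token kept: h unchanged
        options = [u for u in tree.neighbors(vs) if u != toward_source]
        toward_source, vs = vs, rng.choice(options)
        h += 1                                # token passed one hop away from v*
    return source, vs, h, tree.ball(vs, T // 2)
```

With `alpha` identically 1 the source always sits one hop from the center of the snapshot, and with `alpha` identically 0 it always sits on a leaf, matching the two special cases described above.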

The real challenge, then, is choosing the parameters $\alpha_d(t, h)$, which fully specify the virtual source transition probabilities. 
These parameters can significantly alter the anonymity and spreading properties of adaptive diffusion. In this work, we explain 
how to choose the parameters $\alpha_d$ to achieve the desired source obfuscation. 

III. Snapshot-based adversarial model 

Under the snapshot adversarial model, an adversary observes the infected subgraph $G_T$ at a certain time T and produces an 
estimate $\hat{v}$ of the source $v^*$ of the message. Since the adversary is assumed to not have any prior information on which node 
is likely to be the source, we analyze the performance of the maximum likelihood estimator 

\[ \hat{v}_{ML} = \arg\max_{v \in G_T} \mathbb{P}(G_T \mid v) . \tag{2} \]

We show that adaptive diffusion with appropriate parameters can achieve perfect obfuscation, i.e. the probability of detection 
for the ML estimator when n nodes are infected is close to 1/n: 

\[ \mathbb{P}(\hat{v}_{ML} = v^* \mid N_T = n) = \frac{1}{n} + o\left(\frac{1}{n}\right) . \tag{3} \]

This is the best source obfuscation that can be achieved by any protocol, since there are only n candidates for the source and 
they are all equally likely. 


A. Main Result (Snapshot Model) 

In this section, we show that for appropriate choice of parameters $\alpha_d(t, h)$, we can achieve both fast spreading and perfect 
obfuscation over d-regular trees. We start by giving baseline spreading rates for deterministic spreading and diffusion. 

Given a contact network of an infinite d-regular tree, d > 2, consider the following deterministic spreading protocol. At 
time t = 1, the source node infects all its neighbors. At $t \geq 2$, the nodes at the boundary of the infection spread the message 
to their uninfected neighbors. Thus, the message spreads one hop in every direction at each timestep. This approach is the 
fastest-possible spreading, infecting $N_T = 1 + d((d-1)^T - 1)/(d-2)$ nodes at time T, but the source is trivially identified 
as the center of the infected subtree. In this case, the infected subtree is a balanced regular tree where all leaves are at equal 
depth from the source. 
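
As a quick sanity check on the closed form for deterministic spreading, the count can be reproduced level by level. This is an illustrative sketch; the function names `infected_after` and `closed_form` are hypothetical.

```python
def infected_after(d, T):
    """Count infected nodes after T steps of one-hop-per-step spreading
    on an infinite d-regular tree, one level at a time."""
    total, frontier = 1, 1  # the source itself
    for step in range(1, T + 1):
        # The source has d uninfected neighbors; every later boundary
        # node has d - 1 (its remaining neighbor is already infected).
        frontier *= d if step == 1 else d - 1
        total += frontier
    return total


def closed_form(d, T):
    """N_T = 1 + d((d-1)^T - 1)/(d-2), valid for d > 2."""
    return 1 + d * ((d - 1) ** T - 1) // (d - 2)
```

For instance, with d = 3 and T = 2 both functions give 1 + 3 + 6 = 10 infected nodes.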

Now consider a random diffusion model. At each timestep, each uninfected neighbor of an infected node is independently 
infected with probability q. In this case, $\mathbb{E}[N_T] = 1 + qd((d-1)^T - 1)/(d-2)$, and it was shown in [?] that the probability of 
correct detection for the maximum likelihood estimator of the rumor source is $\mathbb{P}(\hat{v}_{ML} = v^*) \geq C_d$ for some positive constant 
$C_d$ that only depends on the degree d. Hence, the source is only hidden in a constant number of nodes close to the center, 
even when the total number of infected nodes is arbitrarily large. 


Now we consider the spreading and anonymity properties of adaptive diffusion. Let $p^{(t)} = [p_h^{(t)}]_{h \in \{1, \ldots, t/2\}}$ denote the 
distribution of the state of the Markov chain at time t, i.e. $p_h^{(t)} = \mathbb{P}(h_t = h)$. The state transition can be represented as the 
following $((t/2) + 1) \times (t/2)$ dimensional column stochastic matrices: 

\[
p^{(t+2)} =
\begin{bmatrix}
\alpha_d(t,1) & & & \\
1 - \alpha_d(t,1) & \alpha_d(t,2) & & \\
 & 1 - \alpha_d(t,2) & \ddots & \\
 & & \ddots & \alpha_d(t,t/2) \\
 & & & 1 - \alpha_d(t,t/2)
\end{bmatrix}
p^{(t)} .
\]

We treat $h_t$ as strictly positive, because at time t = 0, when $h_0 = 0$, the virtual source is always passed. Thus, $h_t \geq 1$ 
afterwards. At all even t, we desire $p^{(t)}$ to be 

\[
p^{(t)} = \frac{d-2}{(d-1)^{t/2} - 1}
\begin{bmatrix} 1 \\ (d-1) \\ \vdots \\ (d-1)^{t/2-1} \end{bmatrix} ,
\tag{4}
\]

for d > 2, and for d = 2, $p^{(t)} = (2/t)\mathbf{1}_{t/2}$, where $\mathbf{1}_{t/2}$ is the all-ones vector in $\mathbb{R}^{t/2}$. There are $d(d-1)^{h-1}$ nodes at distance h 
from the virtual source, and by symmetry all of them are equally likely to have been the source: 

\[
\mathbb{P}(G_t \mid v^*, \delta_H(v^*, v_t) = h) = \frac{p_h^{(t)}}{d(d-1)^{h-1}} = \frac{d-2}{d\left((d-1)^{t/2} - 1\right)} ,
\]

for d > 2, which is independent of h. Hence, all the infected nodes (except for the virtual source) are equally likely to have 
been the source of the message. This statement is made precise in Equation (7). 

Together with the desired probability distribution in Equation (4), this gives a recursion over t and h for computing the 
appropriate $\alpha_d(t, h)$'s. After some algebra and an initial state $p^{(2)} = 1$, we get that the following choice ensures the desired 
Equation (4): 


\[
\alpha_d(t, h) =
\begin{cases}
\dfrac{(d-1)^{t/2-h+1} - 1}{(d-1)^{t/2+1} - 1} & \text{if } d > 2 \\[2ex]
\dfrac{t - 2h + 2}{t + 2} & \text{if } d = 2
\end{cases}
\tag{5}
\]

With this choice of parameters, we show that adaptive diffusion spreads fast, infecting $N_t = O((d-1)^{t/2})$ nodes at time t, 
and each of the nodes except for the virtual source is equally likely to have been the source. 
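
The claim that this choice of parameters produces the desired distribution can be checked numerically. Written out entrywise, the transition is $p_h^{(t+2)} = \alpha_d(t,h)\,p_h^{(t)} + (1 - \alpha_d(t,h-1))\,p_{h-1}^{(t)}$. The sketch below applies this recursion in exact rational arithmetic and compares the result against Equation (4). It assumes that the d > 2 branch of Equation (5) reads $((d-1)^{t/2-h+1} - 1)/((d-1)^{t/2+1} - 1)$; the function names `alpha`, `target`, and `step` are hypothetical.

```python
from fractions import Fraction


def alpha(d, t, h):
    """Keep probability from Equation (5); t even, 1 <= h <= t/2."""
    if d == 2:
        return Fraction(t - 2 * h + 2, t + 2)
    return Fraction((d - 1) ** (t // 2 - h + 1) - 1,
                    (d - 1) ** (t // 2 + 1) - 1)


def target(d, t):
    """Desired distribution of h_t from Equation (4); entry h-1 holds p_h."""
    m = t // 2
    if d == 2:
        return [Fraction(2, t)] * m
    c = Fraction(d - 2, (d - 1) ** m - 1)
    return [c * (d - 1) ** (h - 1) for h in range(1, m + 1)]


def step(d, t, p):
    """Apply one Markov chain transition, from time t to time t + 2."""
    m = t // 2
    q = [Fraction(0)] * (m + 1)
    for h in range(1, m + 1):
        a = alpha(d, t, h)
        q[h - 1] += a * p[h - 1]       # token kept: h unchanged
        q[h] += (1 - a) * p[h - 1]     # token passed: h -> h + 1
    return q
```

Starting from `p = [Fraction(1)]` at t = 2 and calling `step` repeatedly reproduces `target(d, t)` at every even t. For d = 3 and t = 2, for example, the keep probability is 1/3, so the distribution at t = 4 is (1/3, 2/3).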

Theorem 3.1: Suppose the contact network is a d-regular tree with $d \geq 2$, and one node $v^*$ in G starts to spread a message 
according to Protocol 1 at time t = 0, with $\alpha_d(t, h)$ chosen according to Equation (5). At a certain time T > 0 an adversary 
estimates the location of the source $v^*$ using the maximum likelihood estimator $\hat{v}_{ML}$. The following properties hold for Protocol 1: 

(a) the number of infected nodes at time T is 

\[
N_T \geq
\begin{cases}
\dfrac{2(d-1)^{(T+1)/2} - d}{d-2} + 1 & \text{if } d > 2 \\[2ex]
T + 1 & \text{if } d = 2
\end{cases}
\tag{6}
\]

(b) the probability of source detection for the maximum likelihood estimator at time T is 

\[
\mathbb{P}(\hat{v}_{ML} = v^*) \leq
\begin{cases}
\dfrac{d-2}{2(d-1)^{(T+1)/2} - d} & \text{if } d > 2 \\[2ex]
\dfrac{1}{T} & \text{if } d = 2
\end{cases}
\tag{7}
\]

(c) the expected hop-distance between the true source $v^*$ and its estimate $\hat{v}_{ML}$ under maximum likelihood estimation is lower 
bounded by 

\[ \mathbb{E}[d(\hat{v}_{ML}, v^*)] \geq \ldots \tag{8} \]

(Proof in Section VIII-A.) 
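
To see how the bounds in (6) and (7) relate to the balanced-tree snapshots, the following sketch evaluates them at even T for d > 2. It assumes, following the discussion of Equation (4), that at even T the snapshot is a balanced tree of depth T/2 and that the ML estimator picks uniformly among the $N_T - 1$ nodes other than the virtual source. All function names are hypothetical.

```python
def balanced_tree_size(d, depth):
    """Nodes in a balanced tree of the given depth cut from a d-regular tree."""
    return 1 + sum(d * (d - 1) ** (k - 1) for k in range(1, depth + 1))


def size_lower_bound(d, T):
    """Right-hand side of (6) for d > 2."""
    return (2 * (d - 1) ** ((T + 1) / 2) - d) / (d - 2) + 1


def detection_upper_bound(d, T):
    """Right-hand side of (7) for d > 2."""
    return (d - 2) / (2 * (d - 1) ** ((T + 1) / 2) - d)


def check(d, T):
    """Return (N_T, bound (6), 1/(N_T - 1), bound (7)) at even T."""
    n = balanced_tree_size(d, T // 2)
    return n, size_lower_bound(d, T), 1 / (n - 1), detection_upper_bound(d, T)
```

For d = 3 and T = 2, `check` gives a snapshot of 4 nodes against a lower bound of about 3.66, and a detection probability of 1/3 against an upper bound of about 0.38.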

Although this choice of parameters achieves perfect obfuscation, the spreading rate is slower than the deterministic spreading 
model, which infects $O((d-1)^T)$ nodes at time T. However, this type of constant-factor loss in the spreading rate is inevitable: 
the only way to deviate from the deterministic spreading model is to introduce appropriate delays. 

In order to spread according to adaptive diffusion with the prescribed $\alpha_d(t, h)$, the system needs to know the degree d 
of the underlying contact network. However, performance is insensitive to knowledge of d for certain parameter settings, as 
shown in the following proposition. Specifically, one can choose $\alpha_d(t, h) = 0$ for all d, h, and t and still achieve performance 
comparable to the optimal choice. The main idea is that there are as many nodes in the boundary of the snapshot (leaf nodes) 
as there are in the interior, so it is sufficient to hide among the leaves. One caveat is that if the underlying contact network is a 
line (i.e. d = 2) then this approach fails, since there are only two leaf nodes at any given time, and the probability of detection 
is trivially 1/2. 

Proposition 3.2: Suppose that the underlying contact network $G$ is an infinite $d$-regular tree with $d > 2$, and one node $v^*$ in $G$ starts to spread a message at time $t = 0$ according to Protocol 1 with $\alpha_d(h,t) = 0$ for all $d$, $h$, and $t$. At a certain time $T > 1$ an adversary estimates the location of the source $v^*$ using the maximum likelihood estimator $\hat{v}_{ML}$. Then the following properties hold:

(a) the number of infected nodes at time $T > 1$ is at least

$$N_T \geq \frac{(d-1)^{(T+1)/2}}{d-2}; \tag{9}$$

(b) the probability of source detection for the maximum likelihood estimator at time $T$ is

$$\mathbb{P}(\hat{v}_{ML} = v^*) = \frac{d-1}{2 + (d-2) N_T}; \tag{10}$$

and













( 11 ) 


(c) the expected hop-distance between the true source v* and its estimate v is lower bounded by 

E[5rr(u*, Dml)] > ■ 

(Proof in Section VIII-B.)
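The intuition that it suffices to hide among the leaves can be checked numerically. The following is a minimal sketch, assuming the snapshot is a complete ball of some radius around the virtual source in a $d$-regular tree; the helper names `ball_size`, `leaf_count`, and `detection_probability` are illustrative and not part of the protocol. It verifies that the expression $(d-1)/(2 + (d-2)N_T)$ from Proposition 3.2(b) is exactly the reciprocal of the number of leaves of such a ball.

```python
from fractions import Fraction


def ball_size(d: int, radius: int) -> int:
    """Nodes within `radius` hops of a node in an infinite d-regular tree (d > 2)."""
    return 1 + d * ((d - 1) ** radius - 1) // (d - 2)


def leaf_count(d: int, radius: int) -> int:
    """Nodes at exactly `radius` hops, i.e. the leaves of the ball (radius >= 1)."""
    return d * (d - 1) ** (radius - 1)


def detection_probability(d: int, n_infected: int) -> Fraction:
    """The expression (d - 1) / (2 + (d - 2) * N_T) from Proposition 3.2(b)."""
    return Fraction(d - 1, 2 + (d - 2) * n_infected)


# For every tested degree and radius, the bound equals 1 / (number of leaves).
for d in (3, 4, 5):
    for radius in (1, 2, 3, 4):
        n = ball_size(d, radius)
        assert detection_probability(d, n) == Fraction(1, leaf_count(d, radius))
```

For instance, with $d = 3$ and radius 2 the ball has 10 nodes and 6 leaves, so an adversary guessing uniformly among the leaves succeeds with probability 1/6.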


Multiple snapshots. The results in Theorem 3.1 and Proposition 3.2 hold for a single snapshot. However, an adversary could in principle take multiple snapshots of the same message's spread, at different points in time. We show that doing so increases the probability of detection by at most a logarithmic factor, compared to what it learns from the first snapshot (on average).

Proposition 3.3: Suppose that the underlying contact network $G$ is an infinite $d$-regular tree with $d > 2$, and one node $v^*$ in $G$ starts to spread a message at time $t = 0$ according to Protocol 1 with $\alpha_d(t,h)$ chosen as in Theorem 3.1. At a certain time $T > 0$ an adversary observes a snapshot $G_T$ with $N_T$ nodes. At timesteps $\{T_1, T_2, \ldots, T_m\}$, where $T_i > T$ for all $i \in \{1, 2, \ldots, m\}$, the adversary again observes snapshots $G_{T_i}$. The adversary then estimates the location of the source $v^*$ using a maximum likelihood estimator $\hat{v}_{ML}$ based on knowledge of all observed snapshots. Then the probability of source detection for the maximum likelihood estimator at time $T$ is upper bounded as follows:


(wml = V *) < G 


logd_i Nt 
Nx — 1 


logd_iiVr 

Nx 


( 12 ) 


where the constant G depends only on the tree degree $d$.

(Proof in Section VIII-C.)

This result suggests that an adversary cannot learn much more than the information it learns from the first snapshot; i.e., the probability of detection increases at most from $O(1/N_T)$ to $O(\log N_T / N_T)$. Moving forward, we will assume that the snapshot adversary observes only one snapshot, at time $T$.


B. Irregular Trees 

In this section, we study adaptive diffusion on irregular trees, with potentially different degrees at the vertices. Although the degrees are irregular, we still apply adaptive diffusion with $\alpha_{d_0}(t,h)$'s chosen for a specific $d_0$ that might be mismatched with the graph due to degree irregularities. There are a few challenges in this degree-mismatched adaptive diffusion. First, finding the maximum likelihood estimate of the source is not immediate, due to degree irregularities. Second, it is not clear a priori which choice of $d_0$ is good. We first show an efficient message-passing algorithm for computing the maximum likelihood source estimate. Using this estimate, we illustrate through simulations how adaptive diffusion performs and show that the detection probability is not too sensitive to the choice of $d_0$ as long as $d_0$ is above a threshold that depends on the degree distribution.

Then, for the special choice of $d_0 = \infty$, we precisely characterize the maximum likelihood probability of detection and demonstrate that adaptive diffusion does not provide perfect obfuscation. Doing so requires proving a concentration result for an extreme value defined over Galton-Watson branching processes, which may be of independent interest. We use the associated analysis to propose a modification of adaptive diffusion called preferential-attachment adaptive diffusion (PAAD), which empirically lowers the probability of detection over irregular trees, compared to standard adaptive diffusion.


Efficient ML estimation. To keep the discussion simple, we assume that $T$ is even. The same approach can be naturally extended to odd $T$. Since the spreading pattern in adaptive diffusion is entirely deterministic given the sequence of virtual sources at each timestep, computing the likelihood $\mathbb{P}(G_T | v^* = v)$ is equivalent to computing the probability of the virtual source moving from $v$ to $v_T$ over $T$ timesteps. On trees, there is only one path from $v$ to $v_T$, and since we do not allow the virtual source to "backtrack", we only need to compute the probability of every virtual source sequence $(v_0, v_2, \ldots, v_T)$ that meets the constraint $v_0 = v$. Due to the Markov property exhibited by adaptive diffusion, we have

$$\mathbb{P}(G_T | \{(v_t, h_t)\}_{t \in \{2,4,\ldots,T\}}) = \prod_{t < T-1,\ t \text{ even}} \mathbb{P}(v_{t+2} | v_t, h_t),$$

where $h_t = \delta_H(v_0, v_t)$. For $t$ even, $\mathbb{P}(v_{t+2} | v_t, h_t) = \alpha_d(t, h_t)$ if $v_t = v_{t+2}$ and $\frac{1 - \alpha_d(t, h_t)}{d_{v_t} - 1}$ otherwise. Here $d_{v_t}$ denotes the degree of node $v_t$ in $G$. Given a virtual source trajectory $\mathcal{P} = (v_0, v_2, \ldots, v_T)$, let $J_{\mathcal{P}} = (j_1, \ldots, j_{\delta_H(v_0, v_T)})$ denote the timesteps at which a new virtual source is introduced, with $1 \leq j_i \leq T$. It always holds that $j_1 = 2$ because after $t = 0$, the true source chooses a new virtual source and $v_2 \neq v_0$. If the virtual source at $t = 2$ were to keep the token exactly once after receiving it (so $v_2 = v_4$), then $j_2 = 6$, and so forth. To find the likelihood of a node being the true source, we sum over all such trajectories:

$$\mathbb{P}(G_T | v_0) = \sum_{\mathcal{P} \in \mathcal{S}(v_0, v_T, T)} \underbrace{\frac{1}{d_{v_0}} \prod_{i=1}^{\delta_H(v_0, v_T) - 1} \frac{1}{d_{v_{j_i}} - 1}}_{A_{v_0}} \times \underbrace{\prod_{t < T,\ t \text{ even}} \Big( \mathbb{1}\{t+2 \notin J_{\mathcal{P}}\}\, \alpha_d(t, h_t) + \mathbb{1}\{t+2 \in J_{\mathcal{P}}\}\, (1 - \alpha_d(t, h_t)) \Big)}_{B_{v_0}}, \tag{13}$$

where $\mathbb{1}$ is the indicator function and $\mathcal{S}(v_0, v_T, T) = \{\mathcal{P} : \mathcal{P} = (v_0, v_2, \ldots, v_T) \text{ is a valid trajectory of the virtual source}\}$. Intuitively, part $A_{v_0}$ of the above expression is the probability of choosing the set of virtual sources specified by $\mathcal{P}$, and part $B_{v_0}$ is the probability of keeping or passing the virtual source token at the specified timesteps. Equation (13) holds for both regular and irregular trees. Since the path between two nodes in a tree is unique, and part $A_{v_0}$ is (approximately) the reciprocal of the product of node degrees in that path, $A_{v_0}$ is identical for all trajectories $\mathcal{P}$. Pulling $A_{v_0}$ out of the summation, we wish to compute the summation over all valid trajectories $\mathcal{P}$ of part $B_{v_0}$ (for ease of exposition, we will use $B_{v_0}$ to refer to this whole summation). Although there are combinatorially many valid trajectories, we can simplify the formula in Equation (13) for the particular choice of $\alpha_d(t,h)$'s prescribed for regular trees.

Proposition 3.4: Suppose that the underlying contact network $G$ is an infinite tree with the degree of each node larger than one. One node $v^*$ in $G$ starts to spread a message at time $t = 0$ according to Protocol 1 with the choice of $d = d_0$. At a certain even time $T > 0$, the maximum likelihood estimate of $v^*$ given a snapshot of the infected subtree $G_T$ is

$$\hat{v}_{ML} \in \arg\max_{v \in G_T \setminus \{v_T\}} \ \frac{d_0}{d_v} \prod_{v' \in P(v_T, v) \setminus \{v_T, v\}} \frac{d_0 - 1}{d_{v'} - 1}, \tag{14}$$

where $v_T$ is the (Jordan) center of the infected subtree $G_T$, $P(v_T, v)$ is the unique shortest path from $v_T$ to $v$, and $d_{v'}$ is the degree of node $v'$.

To understand this proposition, consider the snapshot in Figure 3, which was spread using adaptive diffusion (Protocol 1) with a choice of $d_0 = 2$. Then Equation (14) can be computed easily for each node, giving [1/2, 1, 0, 1, 2/3, 1/2, 1/2, 1/4] for nodes [1, 2, 3, 4, 5, 6, 7, 8], respectively. Hence, nodes 2 and 4 are most likely. Intuitively, nodes whose paths to the center contain low-degree nodes are more likely. However, if we repeat this estimation assuming $d_0 = 4$, then Equation (14) gives [3, 2, 0, 2, 4/3, 3, 3, 3/2]. In this case, nodes 1, 6, and 7 are most likely. When $d_0$ is large, adaptive diffusion tends to place the source closer to the leaves of the infected subtree, so leaf nodes are more likely to have been the source.
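The per-node scores in this example can be reproduced with a few lines of code. The sketch below evaluates the objective of Equation (14) by walking from each candidate toward the virtual source. The tree topology and node degrees in `PARENT` and `DEGREE` are an assumption: they are one configuration consistent with the scores quoted above, not necessarily the exact tree drawn in the figure, and the function names are illustrative.

```python
from fractions import Fraction

VIRTUAL_SOURCE = 3
# PARENT[v] is the next node on the path from v toward the virtual source
# (hypothetical topology chosen to match the quoted scores).
PARENT = {2: 3, 4: 3, 5: 3, 1: 2, 6: 4, 7: 5, 8: 5}
# Degrees in the underlying contact network G (assumed values).
DEGREE = {1: 4, 2: 2, 3: 3, 4: 2, 5: 3, 6: 4, 7: 2, 8: 4}


def ml_score(v: int, d0: int) -> Fraction:
    """Objective of Equation (14) for candidate v; the virtual source scores 0."""
    if v == VIRTUAL_SOURCE:
        return Fraction(0)
    score = Fraction(d0, DEGREE[v])
    w = PARENT[v]
    while w != VIRTUAL_SOURCE:  # interior nodes strictly between v and v_T
        score *= Fraction(d0 - 1, DEGREE[w] - 1)
        w = PARENT[w]
    return score


def ml_estimates(d0: int):
    """Return (scores for nodes 1..8, list of maximizers)."""
    scores = {v: ml_score(v, d0) for v in sorted(DEGREE)}
    best = max(scores.values())
    return scores, [v for v, s in scores.items() if s == best]


scores_2, best_2 = ml_estimates(2)
scores_4, best_4 = ml_estimates(4)
```

Under these assumed degrees, `best_2` contains nodes 2 and 4 and `best_4` contains nodes 1, 6, and 7, matching the discussion above.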



Fig. 3: Irregular tree $G_4$ with virtual source $v_4$.


Proof of Proposition 3.4. We first make two observations: (a) Over regular trees, $\mathbb{P}(G_T | u) = \mathbb{P}(G_T | w)$ for any $u \neq w \in G_T$, even if they are at different distances from the virtual source. (b) Part $B_{v_0}$ is identical for regular and irregular graphs, as long as the distance from the candidate source node to $v_T$ is the same in both, and the same $d_0$ is used to compute $\alpha_{d_0}(t,h)$. That is, let $G_T$ denote an infected subtree over an irregular tree network, with virtual source $v_T$, and let $\tilde{G}_T$ denote a regular infected subtree with virtual source $\tilde{v}_T$. For candidate sources $v_0 \in G_T$ and $\tilde{v}_0 \in \tilde{G}_T$, if $\delta_H(v_T, v_0) = \delta_H(\tilde{v}_T, \tilde{v}_0) = h$, then $B_{v_0} = B_{\tilde{v}_0}$. So to find the likelihood of $v_0 \in G_T$, we can solve for $B_{\tilde{v}_0}$ using the likelihood of $\tilde{v}_0 \in \tilde{G}_T$, and compute $A_{v_0}$ using the degree information of every node in the infected, irregular subgraph.

To solve for $B_{\tilde{v}_0}$, note that over regular graphs, $A_{\tilde{v}} = 1/(d_0 (d_0 - 1)^{\delta_H(\tilde{v}, \tilde{v}_T) - 1})$, where $d_0$ is the degree of the regular graph. If $G$ is a regular tree, Equation (13) still applies. Critically, for regular trees, the $\alpha_{d_0}(t,h)$'s are designed such that the likelihood of each node being the true source is equal. Hence, $\mathbb{P}(\tilde{G}_T | \tilde{v}_0) = A_{\tilde{v}_0} B_{\tilde{v}_0}$ is a constant that does not depend on $\tilde{v}_0$. This gives $B_{\tilde{v}_0} \propto (d_0 - 1)^{\delta_H(\tilde{v}_0, \tilde{v}_T)}$. From observation (b), we have that $B_{v_0} = B_{\tilde{v}_0}$. Thus we get that for a $v_0 \in G_T \setminus \{v_T\}$,

$$\mathbb{P}(G_T | v_0) = A_{v_0} B_{v_0} \propto \frac{(d_0 - 1)^{\delta_H(v_0, v_T)}}{d_{v_0} \prod_{v' \in P(v_T, v_0) \setminus \{v_0, v_T\}} (d_{v'} - 1)}.$$

After scaling appropriately and noting that $|P(v_T, v_0)| = \delta_H(v_T, v_0) + 1$, this gives the formula in Equation (14). ∎

We provide an efficient message-passing algorithm for computing the ML estimate in Equation (14), which is naturally distributed. We then use this estimator to simulate message spreading for random irregular trees and show that when $d_0$ exceeds a threshold (determined by the degree distribution), obfuscation is not too sensitive to the choice of $d_0$.

$A_{v_0}$ can be computed efficiently for irregular graphs with a simple message-passing algorithm. In this algorithm, each node $v$ multiplies its degree information by a cumulative likelihood that gets passed from the virtual source to the leaves. Thus if there are $N_T$ infected nodes in $G_T$, then $A_{v_0}$ for every $v_0 \in G_T$ can be computed by passing $O(N_T)$ messages. This message passing is outlined in procedure 'Degree Message' of Algorithm 2. For example, consider computing $A_5$ for the graph in Figure 3. The virtual source $v_T = 3$ starts by setting $A_2 = \frac{1}{2}$, $A_4 = \frac{1}{2}$, and $A_5 = \frac{1}{3}$. This gives $A_5$, but to compute the other values of $A_w$, the message passing continues. Each of the nodes $v \in N(3)$ in turn sets $A_w$ for its children $w \in N(v)$; this is done by dividing $A_v$ by $d_w$ and replacing the factor of $\frac{1}{d_v}$ in $A_v$ with $\frac{1}{d_v - 1}$. For example, node 5 would set $A_w = \frac{1}{d_w} \cdot \frac{1}{2}$ for each of its children $w$. This step is applied recursively until reaching the leaves.

Algorithm 2 Implementation of ML estimator in (14)

Input: infected network $G_T = (V_T, E_T)$, virtual source $v_T$, time $T$, the spreading model parameter $d_0$
Output: $\arg\max_{v \in V_T} \mathbb{P}(G_T | v^* = v)$

 1: $P_v \triangleq \mathbb{P}(G_T | v^* = v)$
 2: $P_{v_T} \leftarrow 0$
 3: $A_v \leftarrow 1$ for $v \in V_T \setminus \{v_T\}$
 4: $A_{v_T} \leftarrow 0$
 5: $A \leftarrow$ Degree Message($G_T$, $v_T$, $v_T$, $A$)
 6: $\mathbb{P}(\tilde{G}_T | \tilde{v}_{leaf}) \leftarrow \frac{1}{d_0 (d_0 - 1)^{T/2 - 1}} \prod_{t < T,\ t \text{ even}} (1 - \alpha_{d_0}(t, t/2))$
 7: for all $v \in V_T \setminus \{v_T\}$ do
 8:     $h \leftarrow \delta_H(v, v_T)$
 9:     $B_v \leftarrow \mathbb{P}(\tilde{G}_T | \tilde{v}_{leaf}) \cdot d_0 \cdot (d_0 - 1)^{h - 1}$
10:     $P_v \leftarrow A_v \cdot B_v$
    return $\arg\max_{v \in V_T} P_v$

11: procedure Degree Message($G_T$, $u$, $v$, $A$)
12:     for all $w \in N(v) \setminus \{u\}$ do
13:         if $v = u$ then
14:             $A_w \leftarrow 1 / d_w$
15:             Degree Message($G_T$, $v$, $w$, $A$)
16:         else
17:             if $v$ is not a leaf then
18:                 $A_w \leftarrow A_v \cdot d_v / (d_w \cdot (d_v - 1))$
19:                 Degree Message($G_T$, $v$, $w$, $A$)
    return $A$

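As discussed earlier, $B_{v_0}$ only depends on $d_0$ and $\delta_H(v_T, v_0)$. If $\tilde{v}_{leaf} \in \tilde{G}_T$ is a leaf node and $\tilde{G}$ is a regular tree, we get

$$\mathbb{P}(\tilde{G}_T | \tilde{v}_{leaf}) = \underbrace{\frac{1}{d_0 (d_0 - 1)^{T/2 - 1}}}_{A_{\tilde{v}_{leaf}}} \ \underbrace{\prod_{t < T,\ t \text{ even}} (1 - \alpha_{d_0}(t, t/2))}_{B_{\tilde{v}_{leaf}}}. \tag{16}$$

For a node $\tilde{v}_0$ that is $h = \delta_H(\tilde{v}_0, \tilde{v}_T) < T/2$ hops from $\tilde{v}_T$ over a regular tree,

$$\mathbb{P}(\tilde{G}_T | \tilde{v}_0) = \mathbb{P}(\tilde{G}_T | \tilde{v}_{leaf}) = \frac{1}{d_0 (d_0 - 1)^{h - 1}} \, B_{\tilde{v}_0}.$$

Finally, $B_{v_0} = B_{\tilde{v}_0}$. So to solve for $P_5$ in our example, we compute $\mathbb{P}(\tilde{G}_T | \tilde{v}_{leaf})$ for a 3-regular graph at time $T = 4$. This gives $\mathbb{P}(\tilde{G}_4 | \tilde{v}_{leaf}) = A_{\tilde{v}_{leaf}} \cdot B_{\tilde{v}_{leaf}} = \frac{1}{6} \cdot (1 - \alpha_3(2, 1)) = \frac{1}{9}$. Thus $B_5 = \mathbb{P}(\tilde{G}_4 | \tilde{v}_{leaf}) \cdot d_0 \cdot (d_0 - 1)^{h - 1} = \mathbb{P}(\tilde{G}_4 | \tilde{v}_{leaf}) \cdot 3 \cdot (2)^0 = \frac{1}{3}$. This gives $\mathbb{P}(G_4 | 5) = A_5 \cdot B_5 = \frac{1}{9}$. The same can be done for other nodes in the graph to find the maximum likelihood source estimate.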
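The top-down degree messages of Algorithm 2 can be sketched as a single traversal from the virtual source. The code below is an illustration under stated assumptions rather than the authors' implementation: the tree in `CHILDREN` and `DEGREE` is a hypothetical snapshot rooted at node 3, the root's children are initialized with $1/d_w$ as in the worked example, and the leaf likelihood of a 3-regular snapshot at $T = 4$ is taken to be 1/9 (each of the 9 non-center nodes of the 10-node ball being equally likely).

```python
from fractions import Fraction

# Hypothetical snapshot rooted at the virtual source (node 3).
CHILDREN = {3: [2, 4, 5], 2: [1], 4: [6], 5: [7, 8], 1: [], 6: [], 7: [], 8: []}
# Assumed degrees in the underlying contact network G.
DEGREE = {1: 4, 2: 2, 3: 3, 4: 2, 5: 3, 6: 4, 7: 2, 8: 4}


def degree_messages(root: int) -> dict:
    """Compute A_w for every non-root node with one message per edge."""
    A = {}
    stack = [(root, w) for w in CHILDREN[root]]  # (sender, receiver) pairs
    while stack:
        v, w = stack.pop()
        if v == root:
            A[w] = Fraction(1, DEGREE[w])
        else:
            # Divide by d_w and swap the factor 1/d_v for 1/(d_v - 1).
            A[w] = A[v] * Fraction(DEGREE[v], DEGREE[w] * (DEGREE[v] - 1))
        stack.extend((w, x) for x in CHILDREN[w])
    return A


def likelihood(A_v: Fraction, hops: int, d0: int, p_leaf: Fraction) -> Fraction:
    """P(G_T | v) = A_v * B_v with B_v = p_leaf * d0 * (d0 - 1)^(hops - 1)."""
    return A_v * p_leaf * d0 * (d0 - 1) ** (hops - 1)


A = degree_messages(3)
P_5 = likelihood(A[5], hops=1, d0=3, p_leaf=Fraction(1, 9))
```

Each edge of the snapshot carries exactly one message, which is where the $O(N_T)$ message count comes from.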


Simulation studies. We tested adaptive diffusion over random trees in which each node's degree was drawn i.i.d. from a fixed distribution. Figure 4 illustrates simulation results for random trees in which each node has degree 3 or 4 with equal probability, averaged over 100,000 trials. By the law of large numbers, the number of nodes infected scales as $N_T \sim \mathbb{E}[D - 1]^{T/2} = 2.5^{T/2}$, where $D$ represents the degree distribution of the underlying random irregular tree. The value of $d_0$ corresponds to a regular tree with size scaling as $(d_0 - 1)^{T/2}$. Hence, one can expect that for $d_0 - 1 < 2.5$, the source is likely to be in the center of the infection, and for $d_0 - 1 > 2.5$ the source is likely to be at the boundary of the infection. Since the number of nodes in the boundary is exponentially larger than the number of nodes in the center, the detection probability is lower for $d_0 - 1 > 2.5$. This is illustrated in Figure 4, which matches our prediction. In general, choosing $d_0 = 1 + \lceil \mathbb{E}[D - 1] \rceil$ provides the best obfuscation, and it is robust for values above that. In this plot, data points represent successive even timesteps; their uniform spacing on the (log-scale) horizontal axis implies the message is spreading exponentially quickly.
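The rule of thumb for the assumed degree, one plus the ceiling of the mean number of children $\mathbb{E}[D - 1]$, is simple to evaluate for a given degree distribution. The helper names below are illustrative only.

```python
import math


def mean_offspring(support, probs):
    """E[D - 1] for a degree distribution given by its support and probabilities."""
    return sum(p * (f - 1) for f, p in zip(support, probs))


def suggested_d0(support, probs):
    """Rule of thumb from the simulations: d0 = 1 + ceil(E[D - 1])."""
    return 1 + math.ceil(mean_offspring(support, probs))


# Degrees 3 or 4 with equal probability: E[D - 1] = 2.5, so d0 = 4.
d0_example = suggested_d0((3, 4), (0.5, 0.5))
```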

















Fig. 4: The probability of detection by the maximum likelihood estimator depends on the assumed degree $d_0$; the source cannot hide well below a threshold value of $d_0$.



Fig. 5: Adaptive diffusion no longer provides perfect obfuscation for highly irregular graphs. 


Figure 5 illustrates the probability of detection as a function of infection size while varying the degree distribution of the underlying tree. The notation (3, 5) => (0.5, 0.5) in the legend indicates that each node in the tree has degree 3 or 5, each with probability 0.5. For each distribution tested, we chose $d_0$ to be the maximum degree of each degree distribution. The average size of infection scales as $N_T \sim \mathbb{E}[D - 1]^{T/2}$ as expected, whereas the probability of detection scales as $(d_{\min} - 1)^{-T/2} = 2^{-T/2}$, which is independent of the degree distribution. This suggests that adaptive diffusion fails to provide near-perfect obfuscation when the underlying graph is irregular, and the gap increases with the irregularity of the graph. In the next section, we quantify this gap, and gain intuition about how to reduce it.


Probability of detection. In this section, we provide the probability of detection for adaptive diffusion over trees whose node degrees are drawn i.i.d. from some distribution $D$, for $d_0 = \infty$. However, we cannot exactly use the ML estimator from Equation (14), which assumes the infinite irregular tree $G$ is given and the source $v^*$ is chosen randomly from the nodes of $G$. Equation (14) is the correct ML estimator in any practical scenario, but analyzing the probability of detection under this model requires a prior on the (infinitely many) nodes of $G$. We therefore consider a closely related random process, in which we fix a source $v^*$ and generate $G$ (and consequently, $G_T$) on the fly. Specifically, at time $t = 0$, $v^*$ draws a degree $d_{v^*}$ from $D$ and generates $d_{v^*}$ child nodes. The source picks one of these neighbors uniformly at random to be the new virtual source. Each time a node $v$ is infected according to Protocol 1, $v$ draws its degree $d_v$ from $D$, then generates $d_v - 1$ child nodes. For example, as soon as $v_2$, a neighbor of $v^*$, receives the virtual source token, it draws its degree from $D$ and generates $d_{v_2} - 1$ children. The structure of the underlying, infinite contact network $G$ is independent of $G_T$ conditioned on the uninfected neighbors of the leaves of $G_T$, and need not be considered. The adversary observes $\mathcal{G}_T$, which is an unlabeled snapshot including $G_T$ and its uninfected neighbors. We have that $\mathbb{P}(\hat{v}_{MAP} = v^* | T) = \sum_{\mathcal{G}_T} \mathbb{P}(\mathcal{G}_T | T)\, \mathbb{P}(\hat{v}_{MAP} = v^* | \mathcal{G}_T)$. We first consider $\mathbb{P}(\hat{v}_{MAP} = v^* | \mathcal{G}_T)$.

1) Probability of Detection Given a Snapshot: The adversary observes this random process at time $T$ (i.e., it observes $\mathcal{G}_T$, knowing that the interior $G_T$ are the infected nodes), and selects one of the leaf nodes as its estimate of the true source that started the random process. The following theorem analyzes the probability of detecting the true source for any estimate $\hat{v}$, given a snapshot $\mathcal{G}_T$.

Theorem 3.5: Under the above described random process of adaptive diffusion, an adversary observes the snapshot $\mathcal{G}_T$ at an even time $T > 0$ and estimates $\hat{v} \in \partial G_T$. For any estimator $\hat{v}$, the conditional probability of detection is

$$\mathbb{P}(\hat{v} = v^* | \mathcal{G}_T) = \frac{1}{d_{v_T}} \prod_{w \in \phi(\hat{v}, v_T) \setminus \{\hat{v}, v_T\}} \frac{1}{d_w - 1}, \tag{17}$$

where $v_T$ is the center of $G_T$, $\phi(\hat{v}, v_T)$ is the (unique) path from $\hat{v}$ to $v_T$, $G_T$ is the interior of $\mathcal{G}_T$, which is the infected sub-tree, and $\partial G_T$ is the set of leaves of $G_T$.

A proof is provided in Section VIII-D. Intuitively, Equation (17) is the probability that the virtual source starting from $\hat{v}$ ends up at $v_T$ (up to some constant factor for normalization). This gives a simple rule for the adversary to achieve the best detection probability by computing the MAP estimate:

$$\hat{v}^{(T)}_{MAP} \in \arg\max_{v \in \partial G_T} \mathbb{P}(v = v^* | \mathcal{G}_T). \tag{18}$$

Corollary 3.6: Under the hypotheses of Theorem 3.5, the MAP estimator in (18) can be computed as

$$\hat{v}^{(T)}_{MAP} = \arg\min_{v \in \partial G_T} \prod_{w \in \phi(v, v_T) \setminus \{v_T, v\}} (d_w - 1), \tag{19}$$

achieving a conditional probability of detection

$$\mathbb{P}(\hat{v}^{(T)}_{MAP} = v^* | \mathcal{G}_T) = \max_{v \in \partial G_T} \frac{1}{d_{v_T} \prod_{w \in \phi(v, v_T) \setminus \{v_T, v\}} (d_w - 1)}. \tag{20}$$
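The leaf posteriors of Theorem 3.5 and the MAP rule of Corollary 3.6 can be computed in one pass from the center of the snapshot. The sketch below assumes the infected subtree is given as a `children` map rooted at the center $v_T$, and that every interior node has all of its neighbors inside the snapshot, so its degree is its number of children plus one (the center's degree is just its number of children). The function names and the example tree are illustrative.

```python
from fractions import Fraction


def leaf_posteriors(children: dict, root) -> dict:
    """Return {leaf: P(leaf = v* | snapshot)} following Equation (17)."""
    posteriors = {}

    def walk(node, weight):
        kids = children.get(node, [])
        if not kids:  # a leaf: record the accumulated weight
            posteriors[node] = weight
            return
        if node != root:  # interior node w contributes a factor 1 / (d_w - 1)
            weight = weight / len(kids)  # len(kids) == d_w - 1
        for child in kids:
            walk(child, weight)

    walk(root, Fraction(1, len(children[root])))  # leading factor 1 / d_{v_T}
    return posteriors


def map_estimate(children: dict, root):
    """MAP leaf and its conditional probability of detection (Corollary 3.6)."""
    posteriors = leaf_posteriors(children, root)
    best = max(posteriors, key=posteriors.get)
    return best, posteriors[best]


# Center 0 has two children; node 1 has degree 2 and node 2 has degree 4.
SNAPSHOT = {0: [1, 2], 1: [3], 2: [4, 5, 6]}
posteriors = leaf_posteriors(SNAPSHOT, 0)
best_leaf, p_detect = map_estimate(SNAPSHOT, 0)
```

In this example the leaf behind the low-degree node (leaf 3) has posterior 1/2, while each of the three leaves behind the degree-4 node has posterior 1/6, illustrating why low-degree paths are easier to detect.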


When applied to regular trees, this recovers known results of [42], which confirms that adaptive diffusion provides strong anonymity guarantees under $d$-regular trees. But more importantly, Corollary 3.6 characterizes how the anonymity guarantee depends on the general topology of the snapshot. We illustrate this in two extreme examples: a regular tree and an extreme example in Figure 6.

For a $d$-regular tree, where all nodes have the same degree, the size of infection at even time $T$ is the number of nodes in a $d$-regular tree of depth $T/2$:

$$N_T = \frac{d (d-1)^{T/2}}{d-2} - \frac{2}{d-2}. \tag{21}$$

To achieve perfect obfuscation, we want the probability of detection to decay as $1/N_T$. We can apply Corollary 3.6 to this $d$-regular tree and show the probability of detection is $((d-1)/d)(d-1)^{-T/2}$, which recovers one of the known results in [42, Proposition 2.2]. This confirms that adaptive diffusion achieves near-perfect obfuscation, up to a small factor of $(d-1)/(d-2)$.

On the other hand, when there exists a path to a leaf node consisting of low-degree nodes, adaptive diffusion can be sub-optimal, and the gap to optimality can be made arbitrarily large. Figure 6 illustrates such an example. This is a tree where all nodes have the same degree $d = 5$, except for those nodes along the path from the center $v_T$ to a leaf node $v$, including $v_T$ and excluding $v$. The center $v_T$ has degree two and the nodes in the path have degree three. Hence, the shaded triangles indicate $d$-regular sub-trees of appropriate heights. The size of this infection is $N_T = \big((d-1)^{T/2+1}/(d-2)^2\big)(1 + o(1))$.

Ideally, one might hope to achieve a probability of detection that scales as $1/(d-1)^{T/2}$. However, Corollary 3.6 shows that adaptive diffusion achieves probability of detection $1/2^{T/2}$, with the leaf node $v$ achieving this maximum in Equation (20). Hence, there is a multiplicative gap of $((d-1)/2)^{T/2}$. By increasing $d$, the gap can be made arbitrarily large. On the other hand, such an extreme topology is rare under the i.i.d. tree model.
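The size of this extreme snapshot can be counted directly, which gives a quick check on the leading-order expression for $N_T$ and on the multiplicative gap. The sketch below is an illustration under the description above (center of degree two, path nodes of degree three, all other nodes of degree $d$, all leaves at depth $T/2$); the function names are hypothetical.

```python
def regular_subtree_size(d: int, height: int) -> int:
    """Nodes in a subtree whose root and interior nodes each have d - 1 children."""
    return ((d - 1) ** (height + 1) - 1) // (d - 2)


def skewed_snapshot_size(d: int, T: int) -> int:
    """Exact node count of the low-degree-path snapshot at even time T."""
    half = T // 2
    size = half + 1  # nodes on the special path, from v_T (depth 0) to leaf v
    for k in range(half):
        # The path node at depth k has exactly one off-path child at depth k + 1,
        # which roots a regular subtree reaching down to depth T/2.
        size += regular_subtree_size(d, half - k - 1)
    return size


def leading_term(d: int, T: int) -> float:
    """(d - 1)^(T/2 + 1) / (d - 2)^2, the leading-order size."""
    return (d - 1) ** (T // 2 + 1) / (d - 2) ** 2


def multiplicative_gap(d: int, T: int) -> float:
    """Gap between 2^(-T/2) and the hoped-for (d - 1)^(-T/2)."""
    return ((d - 1) / 2) ** (T / 2)


ratio = skewed_snapshot_size(5, 20) / leading_term(5, 20)
```

For $d = 5$ and $T = 20$ the ratio is within a small fraction of a percent of one, while the gap $((d-1)/2)^{T/2} = 2^{10}$ is already three orders of magnitude.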

2) Concentration of Probability of Detection: Depending on the topology, adaptive diffusion can be significantly sub-optimal. A natural question is "what is the typical topology of a graph resulting from the random tree model?" Under the model introduced previously, we give a concrete answer. Perhaps surprisingly, this typical topology can be characterized by solving a simple convex optimization.

We are interested in the following extremal value

$$\Lambda_{G_T} \equiv d_{v_T} \min_{v \in \partial G_T} \prod_{w \in \phi(v, v_T) \setminus \{v_T, v\}} (d_w - 1), \tag{22}$$

which captures the topology of the snapshot. We want to characterize the typical value of this function over a random tree $G_T$ resulting from the adaptive diffusion process.

Fig. 6: An example of a snapshot emphasizing the sub-optimality of adaptive diffusion.

Observe that the distribution of the balanced tree $G_T$ follows a simple branching process known as the Galton-Watson process. This is because $G_T$ resulting from adaptive diffusion has the same distribution, independent of the location of the source $v^*$. We consider a given degree distribution $D$. We use $D$ to denote both a random variable and its distribution; the distinction should be clear from context. The random variable $D$ has support $f = (f_1, \ldots, f_\eta)$ associated with probability $p = (p_1, \ldots, p_\eta)$ such that the degree $d_v$ of node $v$ is i.i.d. with

$$d_v = \begin{cases} f_1 & \text{with probability } p_1, \\ \ \vdots & \\ f_\eta & \text{with probability } p_\eta, \end{cases} \tag{23}$$

where $2 \leq f_1 < f_2 < \cdots < f_\eta$ are integers and the positive $p_i$'s sum to one. We also assume $D$'s support set has at least two elements, i.e., $\eta \geq 2$.

Note that adaptive diffusion always passes the virtual source token to a uniformly chosen neighbor. It is straightforward to show that adaptive diffusion starting from a leaf node $v^*$ has the same distribution over graphs as the following branching process, denoted $G'_T$: at time $T = 0$, a root node, which we denote as the virtual source $v_T$, creates $D$ offspring. At each subsequent even time step, each leaf node in $G'_T$ creates new offspring independently according to $D - 1$ (where we subtract one because each leaf is already connected to its parent). This process is repeated until time step $T$, which generates a random tree $G'_T$. More precisely, the two branching processes are equal in distribution: $G_T \stackrel{d}{=} G'_T$. This can be seen by observing that conditioned on the path of nodes $\phi(v^*, v_T)$, the branching processes are identical. Since the node degrees in this path are drawn independently, the path is equally distributed whether it starts from the virtual source $v_T$ or the leaf node $v^*$.

The following theorem provides a concentration inequality on the extremal quantity $\Lambda_{G_T}$, which in turn determines the probability of detection as provided by Corollary 3.6:

$$\mathbb{P}(\hat{v}_{MAP} = v^* | \mathcal{G}_T) = \frac{1}{\Lambda_{G_T}}. \tag{24}$$
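To get a feel for the extremal quantity in Equation (22), one can sample it from the branching process directly, without materializing the whole tree. The following is a minimal Monte Carlo sketch, assuming leaves sit at depth $T/2 \geq 1$ below the root and that only the degrees of the root and interior nodes matter; `sample_lambda` and its parameters are illustrative names, and the running time grows exponentially with the depth.

```python
import random


def sample_lambda(support, probs, half_T: int, rng: random.Random) -> int:
    """Draw one realization of d_root * min over leaves of prod (d_w - 1)."""

    def draw() -> int:
        return rng.choices(support, weights=probs)[0]

    def min_product_below(depth: int) -> int:
        # Minimum, over leaves under a node at `depth`, of the product of
        # (d_w - 1) along interior nodes from this node down (leaves excluded).
        if depth == half_T:
            return 1
        d = draw()
        return (d - 1) * min(min_product_below(depth + 1) for _ in range(d - 1))

    root_degree = draw()
    return root_degree * min(min_product_below(1) for _ in range(root_degree))


rng = random.Random(0)
# Degenerate single-degree case: every path gives d * (d - 1)^(T/2 - 1).
regular_value = sample_lambda((3,), (1.0,), 3, rng)
# Irregular case: the sample must lie between the all-minimum and all-maximum paths.
irregular_value = sample_lambda((2, 3), (0.3, 0.7), 4, rng)
```

By Equation (24), the reciprocal of each sampled value is the conditional probability of detection for that realization.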


Theorem 3.7: For an even $T > 0$, suppose a random tree $G_T$ is generated from the root $v_T$ according to the Galton-Watson process with i.i.d. degree distribution $D$, where $f$ and $p$ are defined as in (23). Then the following results hold:

(a) If $p_1 (f_1 - 1) \geq 1$, for any $\delta > 0$, there exist positive constants $C_{D,\delta}$ and $C'_{D,\delta}$ that depend only on the degree distribution and the choice of $\delta$ such that

$$\mathbb{P}\left( \left| \frac{\log(\Lambda_{G_T})}{T/2} - \log(f_1 - 1) \right| > \delta \right) \leq e^{-C_{D,\delta} T}, \tag{25}$$

for an even time $T \geq C'_{D,\delta}$.

(b) If $p_1 (f_1 - 1) < 1$, define the mean number of children:

$$\mu_D \equiv \sum_{i=1}^{\eta} p_i (f_i - 1),$$

and the set

$$\mathcal{R}_D = \{\, r \in \mathcal{S}_\eta \ | \ \log(\mu_D) \geq D_{KL}(r \| \beta) \,\}, \tag{26}$$

where $\mathcal{S}_\eta$ denotes the $\eta$-dimensional probability simplex, $D_{KL}(\cdot \| \cdot)$ denotes Kullback-Leibler divergence, and $\beta$ is a length-$\eta$ probability vector in which $\beta_i = p_i (f_i - 1)/\mu_D$. Further, define $r^*$ as follows:

$$r^* = \arg\min_{r \in \mathcal{R}_D} \ \langle r, \log(f - 1) \rangle, \tag{27}$$

where $\langle r, \log(f - 1) \rangle = \sum_{i=1}^{\eta} r_i \log(f_i - 1)$. Then for any $\delta > 0$, there exist positive constants $C_{D,\delta}$ and $C'_{D,\delta}$ that depend only on the degree distribution $D$ and the choice of $\delta > 0$ such that

$$\mathbb{P}\left( \left| \frac{\log(\Lambda_{G_T})}{T/2} - \langle r^*, \log(f - 1) \rangle \right| > \delta \right) \leq e^{-C_{D,\delta} T}, \tag{28}$$

for an even time $T \geq C'_{D,\delta}$.

The results in parts (a) and (b) can be merged, in the sense that the solution of (27) is $r^* = [1, 0, \ldots, 0]$ when $p_1 (f_1 - 1) \geq 1$. A proof of this theorem is provided in Section VIII-E. Putting it together with (24), it follows that the probability of detection concentrates around

$$-\frac{1}{T/2} \log\big(\mathbb{P}(\hat{v}_{MAP} = v^*)\big) \simeq \langle r^*, \log(f - 1) \rangle$$

in case (b) and around $\log(f_1 - 1)$ in case (a). Here $\simeq$ indicates concentration for large enough $T$. We want to emphasize that $r^*$ can be computed using off-the-shelf optimization tools, since the program in (27) is a convex program of dimension $\eta$. This follows from the fact that the objective is linear in $r$ and the feasible region is convex, since KL divergence is convex in $r$.

For example, if $D$ is 3 w.p. 0.7 or 4 w.p. 0.3, then this falls under case (a). The theorem predicts the probability of detection to decay as $(3 - 1)^{-T/2}$. On the other hand, if

$$D = \begin{cases} 2 & \text{with probability } 0.3, \\ 3 & \text{with probability } 0.7, \end{cases}$$

then this falls under case (b) with $\mu_D = 1.7$, $\beta_1 = 0.3/1.7$, and $\beta_2 = 1.4/1.7$. In this case, the exponent is a solution of the following optimization for $r^* = [r, 1 - r]$:

$$\begin{aligned} \underset{r \in \mathbb{R}}{\text{minimize}} \quad & r \log 1 + (1 - r) \log 2 \\ \text{subject to} \quad & r \log \frac{1.7\, r}{0.3} + (1 - r) \log \frac{1.7\,(1 - r)}{1.4} \leq \log(1.7), \\ & r \in [0, 1]. \end{aligned}$$

It follows that the optimal solution is $r^* \approx [0.64, 0.36]$ and the probability of detection decays as $2^{-0.36\, T/2}$. Figure 7 confirms this prediction with simulations for these examples.
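For a two-point degree distribution the convex program reduces to a one-dimensional search, so no general-purpose solver is needed. The sketch below is an illustration restricted to that two-point case: because $\log(f_1 - 1) < \log(f_2 - 1)$, the objective is minimized by the largest feasible weight $r$ on $f_1$, and the KL constraint is increasing in $r$ on $[\beta_1, 1]$, so bisection suffices. The function names are hypothetical, and the exponent is reported in base 2 to match the numbers quoted in the text.

```python
import math


def kl_two_point(r: float, beta1: float) -> float:
    """KL divergence between (r, 1 - r) and (beta1, 1 - beta1), natural log."""
    def term(a: float, b: float) -> float:
        return 0.0 if a == 0 else a * math.log(a / b)
    return term(r, beta1) + term(1 - r, 1 - beta1)


def solve_two_point(f, p, tol: float = 1e-12):
    """Return (r, exponent) with r the optimal weight on f[0], for f[0] < f[1]."""
    mu = sum(pi * (fi - 1) for fi, pi in zip(f, p))
    beta1 = p[0] * (f[0] - 1) / mu
    budget = math.log(mu)
    if kl_two_point(1.0, beta1) <= budget:
        r = 1.0  # all weight on the smallest degree is feasible
    else:
        lo, hi = beta1, 1.0  # KL is 0 at beta1 and increasing toward 1
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if kl_two_point(mid, beta1) <= budget:
                lo = mid
            else:
                hi = mid
        r = lo
    exponent = r * math.log2(f[0] - 1) + (1 - r) * math.log2(f[1] - 1)
    return r, exponent


r_b, exponent_b = solve_two_point((2, 3), (0.3, 0.7))
r_a, exponent_a = solve_two_point((3, 4), (0.7, 0.3))
```

For the distribution (2, 3) with probabilities (0.3, 0.7) this gives a weight of roughly 0.64 on degree 2 and an exponent of roughly 0.36, while for (3, 4) with probabilities (0.7, 0.3) all weight falls on degree 3 and the exponent is $\log_2 2 = 1$.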

Theorem 3.7 provides a simple convex program that computes the probability of detection for any degree distribution. For random trees, this quantifies the gap between what adaptive diffusion can guarantee and the perfect obfuscation one desires. We define the rescaled log-multiplicative gap as

$$\Delta_D \equiv \frac{1}{T/2} \log\left( \frac{\mathbb{P}(\hat{v}_{MAP} = v^*)}{1/\mathbb{E}[|\partial G_T|]} \right),$$

where $|\partial G_T|$ is the total number of candidates in a snapshot. It is not difficult to show that $\mathbb{E}[|\partial G_T|]$ grows as $\mu_D^{T/2}$ up to a constant factor; it follows that, for large $T$, $\Delta_D = \log \mu_D - \langle r^*, \log(f - 1) \rangle$. For example, $\Delta_D = 0$ for regular trees, $\Delta_D = \log_2 2.3 - \log_2 2 = 0.20$ for the first example under case (a), and $\Delta_D = \log_2 1.7 - 0.36 = 0.41$ for the second example under case (b).


Simulation studies. Figure 7 empirically checks the predictions in Theorems 3.5 and 3.7. The distribution with support $f = (3, 4)$ and probabilities $p = (0.5, 0.5)$ addresses case (a) of the theorem, where $p_1 (f_1 - 1) \geq 1$. The distribution with support $f = (2, 3)$ and probabilities $p = (0.3, 0.7)$ addresses case (b), where $p_1 (f_1 - 1) < 1$. In both examples, we observe that the empirical $\log(\mathbb{P}(\hat{v} = v^*))/(T/2)$ converges to the theoretical value predicted in Figure 7. However, this convergence may be slow, and the timestep duration of these experiments was limited by computational considerations, since the graph size grows exponentially in time.

3) Preferential Attachment: Our analysis reveals that adaptive diffusion can be significantly sub-optimal when the underlying graph degrees are highly irregular. To bridge this gap, we introduce a family of protocols we call Preferential Attachment Adaptive Diffusion (PAAD). We analyze the performance of PAAD and provide numerical simulations showing that PAAD improves over adaptive diffusion when degrees are irregular.

The reason for this gap is that in typical random trees, there are nodes that are significantly more likely to be the source, compared to other typical candidate nodes. To achieve near-perfect obfuscation, we want all candidate nodes to have similar posterior probabilities of being the source. To balance the posterior probabilities of leaf nodes, we suggest passing the virtual source with higher probability to high-degree nodes. We propose a family of protocols based on this idea, and make this intuition precise in Theorem 3.8.

Fig. 7: Empirical verification of Theorems 3.5 and 3.7. We observe that the probability of detection converges in time to the predicted values, which depend only on the underlying degree distribution.

PAAD is based on adaptive diffusion, but we modify how virtual sources are chosen. We parametrize this family of protocols by a non-negative integer $g$. When a new virtual source is to be chosen, instead of choosing uniformly among its neighbors (except for the previous virtual source), the new virtual source is selected with probability weighted by the size of its $g$-hop neighborhood. Let $\mathcal{N}_g(v)$ denote the set of $g$-hop neighbors of node $v$, and let $\mathcal{N}_g(v, w)$ denote the same set, removing any nodes $z$ for which $w \in \phi(z, v)$, where $\phi(z, v)$ denotes the path between $z$ and $v$. Then for instance, if $g = 1$, each time the virtual source is passed from $v_T$ to $v_{T+2}$, it is passed to a neighbor $w \in \mathcal{N}_1(v_T, v_{T-2})$ with probability proportional to $d_w - 1$:

$$\mathbb{P}(v_{T+2} = w) = \frac{d_w - 1}{\sum_{w' \in \mathcal{N}_1(v_T, v_{T-2})} (d_{w'} - 1)}.$$

For general $g$, the probability is proportional to the size of the candidate $w$'s $g$-hop local neighborhood, excluding those in the direction of the current virtual source $v_T$. Each virtual source $v_T$ chooses the next virtual source as follows: for any node $w \in \mathcal{N}_1(v_T, v_{T-2})$,

$$\mathbb{P}(v_{T+2} = w) = \frac{|\mathcal{N}_g(w, v_T)|}{\sum_{w' \in \mathcal{N}_1(v_T, v_{T-2})} |\mathcal{N}_g(w', v_T)|}.$$

PAAD encourages the virtual source to traverse high-degree nodes. This balances the posterior probabilities, by strengthening the probability of leaf nodes whose paths contain high-degree nodes, while weakening those with low-degree nodes.
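The selection rule can be sketched as a bounded breadth-first search from each candidate. The code below is an illustration, assuming the contact network is given as an adjacency map `adj` and interpreting the $g$-hop neighborhood as all nodes within $g$ hops (for $g = 1$ this reduces to the weight $d_w - 1$ either way); the function names and the example tree are hypothetical.

```python
from collections import deque
from fractions import Fraction


def directed_neighborhood_size(adj: dict, w, blocked, g: int) -> int:
    """Count nodes within g hops of w that are reachable without passing `blocked`."""
    seen = {w, blocked}
    frontier = deque([(w, 0)])
    count = 0
    while frontier:
        node, dist = frontier.popleft()
        if dist == g:
            continue  # do not expand beyond g hops
        for x in adj[node]:
            if x not in seen:
                seen.add(x)
                count += 1
                frontier.append((x, dist + 1))
    return count


def paad_probabilities(adj: dict, current, previous, g: int) -> dict:
    """Probability of passing the token from `current` to each eligible neighbor."""
    candidates = [w for w in adj[current] if w != previous]
    weights = {w: directed_neighborhood_size(adj, w, current, g) for w in candidates}
    total = sum(weights.values())
    return {w: Fraction(weights[w], total) for w in candidates}


# Small truncated tree: the token sits at node 0 and arrived from node 9.
ADJ = {0: [9, 1, 2], 9: [0], 1: [0, 3], 2: [0, 4, 5, 6],
       3: [1, 7, 8], 4: [2], 5: [2], 6: [2], 7: [3], 8: [3]}
one_hop = paad_probabilities(ADJ, current=0, previous=9, g=1)
two_hop = paad_probabilities(ADJ, current=0, previous=9, g=2)
```

With $g = 1$ the degree-4 neighbor is three times as likely to receive the token as the degree-2 neighbor; with $g = 2$ the larger second-hop neighborhood behind node 1 evens the two candidates out.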

This intuition is made precise in the following theorem, which analyzes the probability of detection for a given snapshot. Define the probability that the sequence of decisions on choosing the virtual sources results in the path from a source $v$ to the current virtual source $v_T$ as $Q(\mathcal{G}_T, v) = \prod_{t=1}^{T/2} \mathbb{P}(v_{2t} = w_t)$, where $\phi(v, v_T) = (w_0 = v, w_1, w_2, \ldots, w_{T/2 - 1}, w_{T/2} = v_T)$. The specific probability depends on the choice of $g$ and the topology of the underlying tree. Note that the progression of the virtual source now depends on the $g$-hop neighborhood, and we therefore define $\mathcal{G}_T$ to include the current infected subgraph $G_T$ and its $(g + 1)$-hop neighborhood.

Theorem 3.8: Suppose a node $v^*$ starts to spread a message at time $t = 0$ according to PAAD, where the underlying irregular tree is generated according to the random branching process described in Section III-B. At a certain even time $T > 0$, an adversary observes the snapshot of the infected subtree $\mathcal{G}_T$ and computes a MAP estimate of the source $v^*$. Then, the following results hold:

(a) The MAP estimator is

$$\hat{v}_{MAP} = \arg\max_{v \in \partial G_T} \ d_v\, Q(\mathcal{G}_T, v), \tag{29}$$

where $\partial G_T$ denotes the leaves of $G_T$.

(b) The conditional probability of detection achieved by the MAP estimator is

$$\mathbb{P}(\hat{v}_{MAP} = v^* | \mathcal{G}_T) = \frac{\max_{v \in \partial G_T} d_v\, Q(\mathcal{G}_T, v)}{\sum_{w \in \partial G_T} d_w\, Q(\mathcal{G}_T, w)}. \tag{30}$$

Fig. 8: Probability of detection of regular adaptive diffusion compared to 1-, 2-, and 3-hop preferential attachment adaptive diffusion (PAAD).

Fig. 9: Ratio of observed probability of detection to lower-bound probability of detection, for a range of degree distributions. PAAD has better anonymity properties than regular adaptive diffusion over random, irregular trees.


The proof relies on the techniques developed for Theorem and is omitted due to space limitation. The example from 
Figure 6 illustrates the power of PAAD. For this class of snapshots, it is straightforward to show that under adaptive diffusion, the probability of detection is $2^{-T/2}$, whereas under 1-hop PAAD,


tjPAAD ^ 


2 


Notice from these expressions that the probability of detection under PAAD scales as $(d-1)^{-T/2}$, which achieves perfect obfuscation, whereas regular adaptive diffusion scales as $2^{-T/2}$.

This shows that there exist snapshots where PAAD significantly improves over adaptive diffusion. However, such examples
are rare under the random tree model, and there are also examples of snapshots where adaptive diffusion can achieve a better 
obfuscation than PAAD. To complete the analysis, we would like to show the analog of Theorem 3.7 for PAAD. However, the 
observed snapshot is no longer generated by a standard Galton-Watson branching process, due to the preferential attachment. 
The analysis techniques developed for Theorem 3.7 do not generalize, and new techniques seem to be needed for a technical 
analysis. This is outside the scope of this manuscript, but we show simulations suggesting that PAAD improves over adaptive 
diffusion. 


Simulation studies. PAAD requires each virtual source to know some information about its local neighborhood on the contact
network; in exchange, we observe empirically that it hides the source better than traditional adaptive diffusion. Figure shows
the probability of detection over graphs with a degree distribution of support $f = (2,5)$ with probability $p = (0.5,0.5)$. The
results are averaged over 10,000 realizations of the random graph and the spreading sequence. This plot shows empirically that
preferential attachment adaptive diffusion exhibits better hiding properties than regular adaptive diffusion, and that the benefit
of preferential attachment increases with the size of the neighborhood considered for preferential attachment (e.g., one-hop vs.
two-hop). Notice that our lower bound on probability of detection is $1/|\partial G_T|$ rather than $1/N_T$, as in [42]: this is
because we constrain the source to always be at one of the leaves of the graph, so $1/|\partial G_T|$ lower bounds the probability
of detection.

Fig. 10: Grid adaptive diffusion spreading pattern.

Figure computes the ratio of the observed probability of detection to a lower bound on the probability of detection (i.e.,
$1/|\partial G_T|$), for both adaptive diffusion (AD) and one-hop PAAD. Empirically, we observe that the advantage of PAAD is greater
when the degree distribution is more imbalanced (i.e., when $f_{\max} - f_{\min}$ is large).


C. General Graphs 

In this section, we demonstrate how adaptive diffusion fares over graphs that involve cycles, irregular degrees, and finite 
graph size. We provide theoretical guarantees for the special case of two-dimensional grid graphs, and we show simulated 
results over a social graph dataset. 

1) Grid graphs: Here, we derive the optimal parameters $\alpha(t, h)$ for spreading with adaptive diffusion over an infinite grid
graph, defined as the graph Cartesian product of two infinite line graphs. This example highlights challenges associated with 
spreading over cyclic graphs, while still providing a regular, symmetric structure. To spread over grids, we make some changes 
to the adaptive diffusion protocol, outlined in Protocol (grid adaptive diffusion). 

First, standard adaptive diffusion requires the virtual source to know its distance from the true source. Over trees, this 
information was transmitted by passing a distance counter, $h_t$, that was incremented each time the virtual source changed; since
the network was a tree, this distance from the source was non-decreasing as long as the virtual source was non-backtracking. 
However, on a cyclic graph (e.g., a grid), the virtual source’s non-backtracking random walk could actually cause its distance 
from the true source to decrease with time. We wish to avoid this to preserve adaptive diffusion’s anonymity guarantees. 

Therefore, instead of passing the raw hop distance $h_t$ to each new virtual source, grid adaptive diffusion passes directional
coordinates $(h^H_t, h^V_t)$ detailing the virtual source's horizontal and vertical displacement from the source, respectively. For
example, in Figure 10 the virtual source $v_4$ would receive parameters $(h^H_t, h^V_t) = (-1, 1)$ because it is one hop west and
one hop north of the true source. This indexing assumes some notion of directionality over the underlying contact network;
nodes should know whether they received a message from the north, south, east, or west. If a virtual source chooses to move,
it always passes the token to a node that is further away from the true source, i.e., $|h^H_{t+1}| + |h^V_{t+1}| > |h^H_t| + |h^V_t|$.
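
The outward-move constraint can be sketched in a few lines of Python. This is only an illustration under an assumed convention in which the displacement is stored as a pair with east and north positive; the names `DIRECTIONS` and `outward_moves` are hypothetical and not part of the protocol specification.

```python
# Assumed convention: displacement (h, v) from the true source, east and north positive.
DIRECTIONS = {"north": (0, 1), "south": (0, -1), "east": (1, 0), "west": (-1, 0)}


def outward_moves(h, v):
    """Return the directions whose neighbor is strictly farther (in hops) from the source."""
    current = abs(h) + abs(v)
    moves = {}
    for name, (dh, dv) in DIRECTIONS.items():
        nh, nv = h + dh, v + dv
        # Keep only moves that increase the L1 distance to the source.
        if abs(nh) + abs(nv) > current:
            moves[name] = (nh, nv)
    return moves


# A virtual source one hop west and one hop north of the source may only move north or west.
print(outward_moves(-1, 1))
```

Under this convention, a virtual source sitting on an axis of the grid has three admissible directions, and one sitting strictly inside a quadrant has two.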

To maintain symmetry about the virtual source, we also modify the message-passing algorithm. Just as in adaptive diffusion 
over trees, when a new virtual source sends out branching messages, it sends them in every direction except that of the old 
virtual source. However, unlike adaptive diffusion over trees, each branch message has up to two "forbidden" directions: the
direction of the previous virtual source, and the direction of the node that originated the branching message (these might be 
the same). Thus, if a branch message is sent west, and the previous virtual source was south of the current virtual source, 
each node would only propagate the message west and/or north. Whenever a node receives a branch message and its neighbors 
are not all infected, it infects all uninfected neighbors. As in adaptive diffusion over trees, two waves of directional branching 
messages are sent each time the virtual source moves, in every direction except that of the old virtual source. If the virtual 
source instead chooses to stay fixed, then the same rules hold, except the new virtual source only sends one wave of branch 
messages, symmetrically in every direction. 

Given the spreading protocol, we can choose $\alpha(t, h)$ to give optimal hiding:

$$\alpha(t, h) = \frac{t - 2(h - 1)}{t + 4} \tag{31}$$



Under these conditions, the following result shows that we achieve perfect obfuscation, i.e.,
$\mathbb{P}(\hat{v}_{\mathrm{ML}} = v^*) = 1/N_T + o(1/N_T)$.

Proposition 3.9: Suppose the contact network is an infinite grid, and one node $v^*$ in $G$ starts to spread a message according
to Protocol (grid adaptive diffusion) at time $t = 0$, with $\alpha(t, h)$ chosen according to Equation (31). At a certain time $T > 0$,
an adversary estimates the location of the source $v^*$ using the maximum likelihood estimator $\hat{v}_{\mathrm{ML}}$. The following
properties hold for grid adaptive diffusion:

(a) the number of infected nodes at time $T$ is




(r + i)2 


(32) 


(33) 


(b) the probability of source detection for the maximum likelihood estimator at time T is 

P(UML =V )< (y 

(Proof in Section VIII-F.)

The baseline infection rate for deterministic, symmetric spreading is $N_T = T^2 + (T + 1)^2$. The infection rate of grid adaptive
diffusion is within a constant factor of this maximum possible rate, and it achieves perfect obfuscation over grid graphs. The price
to pay for this non-tree graph is that (a) a significant amount of metadata needs to be transmitted to coordinate the spread,
particularly with respect to the directionality of messages; and (b) the position of the nodes with respect to a global reference
needs to be known. Hence, the current implementation of grid adaptive diffusion has a limited scope, and it remains an open
question how to avoid such requirements for grids and still achieve perfect obfuscation.
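
As a quick sanity check on the baseline count, the sketch below counts by brute force the grid nodes within $T$ hops of an origin and compares the result with $T^2 + (T+1)^2$. It assumes that deterministic, symmetric spreading infects exactly the nodes within hop (L1) distance $T$ of the source; the function name `l1_ball_size` is hypothetical.

```python
def l1_ball_size(T):
    """Count grid nodes (x, y) with |x| + |y| <= T by brute-force enumeration."""
    return sum(
        1
        for x in range(-T, T + 1)
        for y in range(-T, T + 1)
        if abs(x) + abs(y) <= T
    )


# Compare the brute-force count with the closed form for a few small values of T.
for T in range(6):
    print(T, l1_ball_size(T), T**2 + (T + 1) ** 2)
```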

2) Real-world social graphs: In this section, we provide simulation results from running adaptive diffusion over an underlying 
connectivity network of 10,000 Facebook users, as described by the Facebook WOSN dataset [43]. We eliminated all nodes
with fewer than three friends (this approach is taken by several existing anonymous applications so users cannot guess which 
of their friends originated the message), which left us with a network of 9,502 users. 

Over this underlying network, we selected a node uniformly at random as the rumor source, and spread the message using 
adaptive diffusion for trees. We did not use grid adaptive diffusion because Protocol assumes the underlying graph has a 
symmetric structure with a global notion of directionality, whereas the tree-based adaptive diffusion makes no such assumptions. 
We set $d_0 = \infty$, which means that the virtual source is always passed to a new node (i.e., $\alpha_d(t, h) = 0$). This choice is to
make the ML source estimation faster; other choices of $d_0$ may outperform this naive choice. To preserve the symmetry of
our constructed trees as much as possible, we constrained each infected node to infect a maximum of three other nodes in
each timestep. We also give the adversary access to the undirected infection subtree that explicitly identifies all pairs of nodes
for which one node spread the infection to the other. This subtree is overlaid on the underlying contact network, which is not
necessarily tree-structured. We demonstrate in simulation (Figure 11) that even with this strong side information, the adversary
can only identify the true message source with low probability.

Using the naive method of enumerating every possible message trajectory, it is computationally expensive to find the exact
ML source estimate since there are $2^T$ possible trajectories, depending on whether the virtual source stayed or moved at each
timestep. If the true source is one of the leaves, we can closely approximate the ML estimate among all leaf nodes, using the
same procedure as described in Section III-B with one small modification: in graphs with cycles, the term $(d_{v_{j_k}} - 1)$ from
Equation (13) should be substituted with $(d^{u}_{v_{j_k}} - 1)$, where $d^{u}_{v_{j_k}}$ denotes the number of uninfected neighbors of
$v_{j_k}$ at time $j_k$. Loops in the graph cause this value to be time-varying, and also dependent on the location of $v_0$, the
candidate source. We did not approximate the ML estimate for non-leaves because the simplifications used in Section III-B to
compute the likelihood no longer hold, leading to an exponential increase in the problem dimension.

This approach is only an approximation of the ML estimate because the virtual source could move in a loop over the social 
graph (i.e., the same node could be the virtual source more than once, in nonadjacent timesteps). 

On average, adaptive diffusion reached 96 percent of the network within 10 timesteps using $d_0 = 4$. We also computed
the average distance of the true source from the estimated source over the infected subtree (Figure 12). We see that as time
progresses, so does the hop distance of the estimated source from the true source. In social networks, nearly everyone is 
within a small number of hops (say, 6 hops BH) from everyone else, so this computation is not as informative in this setting. 
However, it is relevant in location-based connectivity graphs, which can induce large hop distances between nodes. 


IV. SPY-BASED ADVERSARIAL MODEL 

The spy-based adversary collects more detailed information than the snapshot adversary, but only for a subset of network 
nodes. In this section, we provide some results stating that over $d$-regular trees, choosing $\alpha_d(t, h) = 0$ gives asymptotically
optimal hiding in $d$. While the proofs for these results are not included in this paper (all proofs can be found in [14]), the
results are included for completeness. 

For the spy-based adversary, we model each node other than the source as a spy with probability p. At some point in time, 
the source node v* starts propagating its message over the graph according to some spreading protocol (e.g., diffusion or 
adaptive diffusion). Each spy node $s_i \in V$ observes: (1) the time $T_{s_i}$ (relative to an absolute reference) at which it receives
the message, (2) the parent node $p_{s_i}$ that relayed the message, and (3) any other metadata used by the spreading mechanism (such
as control signaling in the message header). At some time, spies aggregate their observations; using the collected metadata
and the structure of the underlying graph, the adversary estimates the author of the message, $\hat{v}$.

Fig. 11: Near-ML probability of detection for the Facebook graph with adaptive diffusion.

Fig. 12: Hop distance between true source and estimated source over infection subtree for adaptive diffusion over the Facebook
graph.


To define perfect obfuscation for this adversarial model, we first observe the following:

Proposition 4.1 ( KM)} ]: Under a spy-based adversary, no spreading protocol can have a probability of detection less than p. 
This results from considering the first-spy estimator, which returns the parent of the first spy to observe the message.
Regardless of spreading, this estimator returns the true source with probability at least $p$; with probability $p$, the first node
(other than the true source) to see the message is a spy.

We therefore say a protocol achieves perfect obfuscation against a spy-based adversary if the ML probability of detection 
conditioned on the spy probability p is bounded by 

$$\mathbb{P}(\hat{v}_{\mathrm{ML}} = v^*) = p + o(p). \tag{34}$$

However, when the underlying graph is a d-regular tree, the probability of detection increases over time for standard diffusion 
spreading, since the estimator receives more information. Moreover, it is straightforward to show that the probability of detection 
tends to 1 as the degree $d$ of the underlying graph tends to infinity:

Proposition 4. 2 (Sa): Suppose the contact network is a regular tree with degree d. There is a source node v*, and each 
node other than the source is chosen to be a spy node i.i.d. with probability p as described in the spy model. In each 
timestep, each infected node infects each uninfected neighbor independently with probability $q$. Then the probability of detection
satisfies

$$\mathbb{P}(\hat{v}_{\mathrm{ML}} = v^*) \geq 1 - (1 - qp)^d.$$

This bound implies that as degree increases, the probability of detecting the true source of diffusion approaches 1. The 
proposition also results from analyzing the first-spy estimator. These observations suggest that diffusion provides poor anonymity
guarantees in real networks; contact networks may be high degree, and the adversary is not time-constrained. 


A. Main result (Spy-based adversary) 

In this section, we give results stating that over $d$-regular trees, adaptive diffusion with $\alpha_d(t, h) = 0$ achieves asymptotically
perfect obfuscation in $d$. We also show that adaptive diffusion hides the source better than diffusion over $d$-regular trees, $d > 2$.
However, these results depend on a slightly modified implementation of adaptive diffusion, in which some additional metadata is
passed around. This implementation, which we call the Tree Protocol, facilitates analysis and is also fully distributed, avoiding 
the explicit notion of a virtual source. 


Tree Protocol. The spreading protocol follows Algorithm 1 (Spreading on a tree) from US; the goal is to build an infected
subtree with the true source at one of the leaves. Whenever a node $v$ passes a message to node $w$, it includes three pieces of
metadata: (1) the parent node $p_w = v$, (2) a binary direction indicator $u_w \in \{\uparrow, \downarrow\}$, and (3) the node's level in
the infected subtree $m_w \in \mathbb{N}$. The parent $p_w$ is the node that relayed the message to $w$. The direction bit $u_w$ flags
whether node $w$ is a spine node, responsible for increasing the depth of the infected subtree. The level $m_w$ describes the hop
distance from $w$ to the nearest leaf node in the final infected subtree, as $t \to \infty$.

At time $t = 0$, the source chooses a neighbor uniformly at random (e.g., node 1) and passes the message and metadata
$(p_1 = 0, u_1 = \uparrow, m_1 = 1)$. Figure 13 illustrates an example spread, in which node 0 passes the message to node 1. Yellow
denotes spine nodes, which receive the message with $u_w = \uparrow$, and gray denotes those that receive it with $u_w = \downarrow$.
Whenever a node $w$ receives a message, there are two cases. If $u_w = \uparrow$, node $w$ chooses another neighbor $z$ uniformly at
random and forwards the message with 'up' metadata: $(p_z = w, u_z = \uparrow, m_z = m_w + 1)$. All of $w$'s remaining neighbors $z'$
receive the message with 'down' metadata: $(p_{z'} = w, u_{z'} = \downarrow, m_{z'} = m_w - 1)$. For instance, in Figure 13 node 1 passes
the 'up' message to node 2 and the 'down' message to node 3. On the other hand, if $u_w = \downarrow$ and $m_w > 0$, node $w$ forwards
the message to all its remaining neighbors with 'down' metadata: $(p_z = w, u_z = \downarrow, m_z = m_w - 1)$. If a node receives
$m_w = 0$, it does not forward the message further. Algorithm describes this process more precisely.

Observe that adaptive diffusion ensures that the infected subgraph is a balanced tree with the true source at one of the 
leaves. Moreover, unlike regular diffusion, the message does not reach all the nodes in the network under adaptive diffusion 
(even when $T = \infty$). Even though this may seem like a fundamental drawback for adaptive diffusion, it can be shown that the
infected subgraph has a size proportional to $(d-1)^{T/2}$ on regular trees (compared to $(d-1)^T$ under regular diffusion). More
critically, real social networks have cycles, so neighbors of nodes with $m_w = 0$ can still get the message from other nodes in
the network iH. 

As before, this protocol ensures that the infected subgraph is a symmetric tree with the true source at one of the leaves. The 
key difference between Protocol (naive adaptive diffusion) with $\alpha_d(t, h) = 0$ and Protocol 3 (Tree Protocol) is that the latter
does not rely on message-passing from the virtual source to control spreading. Instead, it passes enough control information 
to realize the same spreading pattern in a fully-distributed fashion. 


Protocol 3 Tree Protocol

Input: contact network $G = (V, E)$, source $v^*$, time $T$
Output: infected subgraph $G_T = (V_T, E_T)$

$V_0 \leftarrow \{v^*\}$
$m_{v^*} \leftarrow 0$ and $u_{v^*} \leftarrow \uparrow$
$v^*$ selects one of its neighbors $w$ at random
$V_1 \leftarrow V_0 \cup \{w\}$
$m_w \leftarrow 1$ and $u_w \leftarrow \uparrow$
$t \leftarrow 2$
for $t \leq T$ do
    for all $v \in V_{t-1}$ with uninfected neighbors and $m_v > 0$ do
        if $u_v = \uparrow$ then
            $v$ selects one of its uninfected neighbors $w$ at random
            $V_t \leftarrow V_{t-1} \cup \{w\}$
            $m_w \leftarrow m_v + 1$ and $u_w \leftarrow \uparrow$
        for all uninfected neighboring nodes $z$ of $v$ do
            $V_t \leftarrow V_{t-1} \cup \{z\}$
            $u_z \leftarrow \downarrow$ and $m_z \leftarrow m_v - 1$
    $t \leftarrow t + 1$

In the spy-based adversarial model, each spy $s_i$ in the network observes any received messages, the associated metadata,
and a timestamp $T_{s_i}$. Figure 14 illustrates the information observed by each spy node, where spies are outlined in red.








Fig. 13: Message spread using the tree protocol from [45].

Fig. 14: The information observed by the spy nodes 3, 7, and 8 for the spread in Figure 13. Timestamps in this figure are
absolute, but they need not be.


Source Estimation. The ML source-estimation algorithm for this spreading and adversarial model is described in [14]. The
ML estimation algorithm is not necessary to understand this paper's primary contributions. We include it in this section for
completeness, and because the probability of detection for the spy+snapshot adversarial model in Section V uses terminology
that is introduced in this estimator.

To a snapshot adversary, all leaves in the infected subgraph have the same likelihood. Because adaptive diffusion has 
deterministic timing, spies only help the estimator discard candidate nodes. We assume the message spreads for infinite time. 
There is at least one spy on the spine; consider the first such spy to receive the message, $s_0$. This spine spy (along with its
parent and level metadata) allows the estimator to specify a feasible subtree in which the true source must lie. In Figure 13,
node 8 is on the spine with level $m_8 = 4$, so the feasible subtree is rooted at node 5 and contains all the pictured nodes except
node 8 (9's children and grandchildren also belong, but are not pictured). Spies outside the feasible subtree do not influence
the estimator, because their information is independent of the source conditioned on $s_0$'s metadata. Only leaves of the feasible
subtree could have been the source—e.g., nodes 0, 3, 6, and 7, as well as 9’s grandchildren. 

The estimator then uses spies within the feasible subtree to prune out candidates. The goal is to identify nodes in the feasible
subtree that are on the spine and close to the source. For each spy in the feasible subtree, there exists a unique path to the
spine spy $s_0$, and at least one node on that path is on the spine; the spies' metadata reveals the identity and level of the spine
node on that path with the lowest level, which we call a pivot (details in Protocol 4). For instance, in Figure we can
use spies 7 and 8 to learn that node 2 is a pivot with level $m_2 = 2$. Estimation hinges on the minimum-level pivot across all
spy nodes, $\ell_{\min}$. In the example, $\ell_{\min} = 1$, since spies 3 and 8 identify node 1 as a pivot with level $m_1 = 1$. The true
source must lie in a subtree rooted at a neighbor of $\ell_{\min}$ with no spies. In our example, this leaves only node 0, the true source.

Protocol 4 ML Source Estimator for Algorithm

Input: contact network $G = (V, E)$, spy nodes $S = \{s_0, s_1, \ldots\}$ and metadata $s_i : (p_{s_i}, m_{s_i}, u_{s_i})$
Output: ML source estimate $\hat{v}_{\mathrm{ML}}$

Let $s_0$ denote the lowest-level spine spy, with metadata $(p_{s_0}, m_{s_0}, u_{s_0})$
$\tilde{V} \leftarrow \{v \in V : \delta_H(v, s_0) \leq m_{s_0} \text{ and } p_{s_0} \in P(v, s_0)\}$
$\tilde{E} \leftarrow \{(u, v) : (u, v) \in E \text{ and } u, v \in \tilde{V}\}$
Define the feasible subgraph as $F(\tilde{V}, \tilde{E})$
$L \leftarrow \emptyset$    ▷ Set of feasible pivots
$K \leftarrow \emptyset$    ▷ Set of eliminated pivot neighbors
for all $s \in S$ with $s \in \tilde{V}$ do
    Let $h$ be the hop distance from $s$ to the lowest-level spine node on $P(s, s_0)$
    $\ell_s \leftarrow v \in P(s, s_0) : \delta_H(s, v) = h$
    $k_s \leftarrow v \in P(s, s_0) : \delta_H(s, v) = h - 1$
    $L \leftarrow L \cup \{\ell_s\}$    ▷ Add pivot
    $K \leftarrow K \cup \{k_s\}$    ▷ Add pivot neighbor
Find the lowest-level pivot: $\ell_{\min} \leftarrow \arg\min_{\ell \in L} m_\ell$
$U \leftarrow \emptyset$    ▷ Candidate sources
for all $v \in \tilde{V}$ where $v$ is a leaf in $F(\tilde{V}, \tilde{E})$ do
    if $P(v, \ell_{\min}) \cap K = \emptyset$ then
        $U \leftarrow U \cup \{v\}$
return $\hat{v}_{\mathrm{ML}}$, drawn uniformly from $U$


Anonymity properties. This ML estimation procedure can be analyzed to exactly compute the probability of detection for 
adaptive diffusion on a d-regular tree: 

Theorem 4.3 (H^): Suppose the contact network is a regular tree with degree d > 2. There is a source node v*, and each 
node other than the source is chosen to be a spy node i.i.d. with probability p as described in the spy model. Against colluding 
spies attempting to detect the location of the source, adaptive diffusion achieves the following: 

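(a) The probability of detection is

$$\mathbb{P}(\hat{v}_{\mathrm{ML}} = v^*) = p + \frac{1}{d-2} - \sum_{k=1}^{\infty} \frac{q_k}{(d-1)^k}, \tag{35}$$

where $q_k = \left(1 - (1-p)^{((d-1)^k - 1)/(d-2)}\right)^{d-1} + (1-p)^{((d-1)^{k+1} - 1)/(d-2)}$.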

(b) The expected distance between the source and the estimate is bounded by

$$\mathbb{E}[\delta_H(\hat{v}_{\mathrm{ML}}, v^*)] \geq 2 \sum_{k=1}^{\infty} k \cdot r_k, \tag{36}$$

where $|T_{d,k}| = \frac{(d-1)^k - 1}{d-2}$ and


7’fc = - (1 ^ + {d- 1)(1 - 

(d- 2)(1 - 1^ 


There are two main observations to note regarding this result: 

(1) Asymptotically optimal probability of detection: As tree degree $d$ increases, the probability of detection converges to
the degree-independent fundamental limit in Proposition 4.1, i.e., $\mathbb{P}(\hat{v}_{\mathrm{ML}} = v^*) = p$. This is in contrast to
diffusion, whose probability of detection tends to 1 asymptotically in $d$.

(2) Expected hop distance asymptotically increasing: We observe empirically that for regular diffusion,
$\mathbb{E}[\delta_H(\hat{v}_{\mathrm{ML}}, v^*)]$ approaches 0 as $d$ increases. On the other hand, for adaptive diffusion with a fixed
$p > 0$, as $d \to \infty$, $\limsup \mathbb{E}[\delta_H(\hat{v}_{\mathrm{ML}}, v^*)] = 2(1-p)$.

These observations suggest that adaptive diffusion exhibits provably stronger anonymity properties than standard diffusion 
on regular trees—a suggestion that is backed up by simulations on irregular trees and the Facebook graph in iflTll . 
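
The convergence of the detection probability to $p$ can be explored numerically. The sketch below is a hedged illustration: it assumes the detection probability on a $d$-regular tree takes the form $p + \frac{1}{d-2} - \sum_{k \geq 1} q_k/(d-1)^k$ with $q_k = (1 - (1-p)^{|T_{d,k}|})^{d-1} + (1-p)^{|T_{d,k+1}|}$ and $|T_{d,k}| = ((d-1)^k - 1)/(d-2)$, and it evaluates the algebraically equivalent truncated series $p + \sum_k (1 - q_k)/(d-1)^k$. The function name and the truncation length are assumptions made for illustration.

```python
def detection_probability(p, d, num_terms=40):
    """Truncated evaluation of p + sum_k (1 - q_k) / (d - 1)^k for a d-regular tree, d > 2."""
    total = p
    for k in range(1, num_terms + 1):
        size_k = ((d - 1) ** k - 1) // (d - 2)           # |T_{d,k}|
        size_next = ((d - 1) ** (k + 1) - 1) // (d - 2)  # |T_{d,k+1}|
        q_k = (1 - (1 - p) ** size_k) ** (d - 1) + (1 - p) ** size_next
        total += (1 - q_k) / (d - 1) ** k
    return total


# Under the assumed formula, the value approaches the spy probability p as d grows.
for degree in (3, 5, 20, 200):
    print(degree, detection_probability(0.1, degree))
```

The rewritten series uses the identity $\sum_{k \geq 1} (d-1)^{-k} = 1/(d-2)$, which avoids subtracting two nearly equal quantities when the sum is truncated.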


V. SPY+SNAPSHOT ADVERSARIAL MODEL

The spy+snapshot adversarial model considers a natural combination of the snapshot and spy-based adversaries. At a certain
time $T$, the adversary collects a snapshot of the infection pattern, $G_T$. It also collects metadata from all spies that have seen
the message up to (and including) time $T$. Based on these two sets of metadata, the adversary infers the source.

Fig. 15: Probability of detection under the spy+snapshot adversarial model (panels: $d = 3$ and $d = 5$). As estimation time and
tree degree increase, the effect of the snapshot on detection probability vanishes.


Notably, this stronger model does not significantly impact the probability of detection as time increases. The snapshot helps 
detection when there are few spies by revealing which nodes are true leaves. This effect is most pronounced for small T and/or 
small $p$. The exact probability of detection at time $T$ is given below:


F(vml = V ) = 

(1 _p)l'Sd,T|-l 
|c/S'd,T| 


+ 


TI2 


_ —1) - 


(1 -p) 


\dTd,k 


^min spine node) is a spy 

(1 __ ,1 _[ + 


^min spine node) not a spy 

_ n"! ISrf.T | —{|r'd,*! + l | — ITd,); Dip ,, [ II(A^ d — 2) 


(1 -p) 


^Ex[ 


laS'd.Tl - (d - 2 - X)\dTa,k 


]}- 


all spy descendants of fc-th spine node 


(37) 


where $X \sim \mathrm{Binom}(d - 2, (1-p)^{|T_{d,k}|})$, $|T_{d,k}| = \frac{(d-1)^k - 1}{d-2}$ is the number of nodes in each candidate
subtree for a pivot at level $k$, and $|\partial T_{d,k}| = (d-1)^{k-1}$ is the number of leaf nodes in each candidate subtree.

This expression can be evaluated numerically, as shown in Figure which illustrates the tradeoff between the effect of a
snapshot and spy nodes. The derivation for this expression is included in [14].


VI. Connections to Polya’s urn processes 

In this section, we make a connection between adaptive diffusion on a line and Polya’s urn processes. In doing so, we 
highlight a property of Polya’s urn processes, which inherently provides privacy. Further, we apply the Bayesian interpretation 
of Polya’s urn processes to design a new implementation of adaptive diffusion and analyze the precise cost of revealing the 
control packets to the spy nodes, in terms of leaked anonymity. 

To separately characterize the price of timestamp metadata and control packets, we focus on the concrete example of a line 
graph. Consider a line graph in which nodes 0 and $n + 1$ are spies. One of the $n$ nodes between the spies is chosen uniformly
at random as a source, denoted by $v^* \in \{1, \ldots, n\}$. We let $t_0$ denote the time the source starts propagating the message
according to some global reference clock. Let $T_{s_1} = T_1 + t_0$ and $T_{s_2} = T_2 + t_0$ denote the timestamps when the two spy
nodes receive the message, respectively. Knowing the spreading protocol and the metadata, the adversary uses the maximum 
likelihood estimator to optimally estimate the source. 

Standard diffusion. Consider a standard discrete-time random diffusion with a parameter $q \in (0,1)$, where each uninfected
neighbor is infected with probability $q$. The adversary observes $T_{s_1}$ and $T_{s_2}$. Knowing the value of $q$, it computes the ML
estimate $\hat{v}_{\mathrm{ML}} = \arg\max_{v \in [n]} \mathbb{P}_{T_1 - T_2 \mid v}(T_{s_1} - T_{s_2} \mid v)$, which is optimal assuming a
uniform prior on $v^*$. Since $t_0$ is not known, the adversary can only use the difference $T_{s_1} - T_{s_2} = T_1 - T_2$ to estimate the
source. We can exactly compute the corresponding probability of detection; Figure (bottom panel) illustrates that the posterior
(and the likelihood) is concentrated around the ML estimate, and the source can only hide among $O(\sqrt{n})$ nodes. The detection
probability correspondingly scales as $1/\sqrt{n}$ (top panel).




Fig. 16: Comparisons of probability of detection as a function of the number of nodes $n$ (top) and the posterior distribution of
the source over candidate nodes $v$ for an example with $n = 101$ and $T_2 - T_1 = 25$ (bottom). The line with 'control packet
revealed' uses the Pólya's urn implementation.

Adaptive diffusion on a line. First, recall the adaptive diffusion (Protocol 1) with the choice of
$\alpha_2(t, h) = \frac{t - 2h + 2}{t + 2}$ (Equation) on a line illustrated in Figure. At $t = 0$, the message starts at the source, which
passes the virtual source to node 1, so $v_2 = 1$. The next two timesteps ($t = 1, 2$) are used to restore symmetry about $v_2$. At
$t = 2$, the virtual source stays with probability $\alpha_2(2, 1) = 1/2$. Since the virtual source remained fixed at $t = 2$, at $t = 4$
the virtual source stays with probability $\alpha_2(4, 1) = 2/3$. The key property is that if the virtual source chooses to remain
fixed at the beginning of this random process, it is more likely to remain fixed in the future, and vice versa. This is closely
related to the well-known concept of Pólya's urn processes; we make this connection more precise later in this section.

The protocol keeps the current virtual source with probability $\frac{t - 2\delta_H(v_t, v^*) + 2}{t + 2}$, where $\delta_H(v_t, v^*)$
denotes the hop distance between the source and the virtual source, and passes it otherwise. The control packet therefore
contains two pieces of information: $\delta_H(v_t, v^*)$ and $t$.

Suppose spy nodes only observed timestamps and parent nodes but not control packets. The adversary could then numerically
compute the ML estimate $\hat{v}_{\mathrm{ML}} = \arg\max_{v \in [n]} \mathbb{P}_{T_1 - T_2 \mid v}(T_{s_1} - T_{s_2} \mid v)$. We can compute
the corresponding detection probability exactly. Figure 16 shows the posterior is close to uniform (bottom panel) and the
probability of detection would scale as $1/n$ (top panel), which is the best one can hope for. Of course, spies do observe control
packets, so they can learn $\delta_H(v^*, v_t)$ and identify the source with probability 1. We therefore introduce a new adaptive
diffusion implementation that is robust to control packet information.

Adaptive diffusion via Pólya's urn. The random process governing the virtual source's propagation under adaptive diffusion
is identical to a Pólya's urn process [46]. We propose the following alternative implementation of adaptive diffusion. At $t = 0$,
the protocol decides whether to pass the virtual source left ($D = \ell$) or right ($D = r$), each with probability 1/2. Let $D$ denote
this random choice. Then, a latent variable $q$ is drawn from the uniform distribution over $[0,1]$. Thereafter, at each even time
$t$, the virtual source is passed with probability $q$ or kept with probability $1 - q$. It follows from the Bayesian interpretation of
Pólya's urn processes that this process has the same distribution as the adaptive diffusion process.
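
The equivalence between the two implementations can be checked exactly for small cases. The sketch below is an illustration under stated assumptions: it takes the keep probability on a line to be $(t - 2h + 2)/(t + 2)$, which matches the values 1/2 at $(t, h) = (2, 1)$ and 2/3 at $(4, 1)$ quoted above, it assumes one keep/pass decision at every even timestep, and it compares only the distribution of the virtual source's hop distance. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def keep_probability(t, h):
    """Assumed keep probability on a line: (t - 2h + 2) / (t + 2)."""
    return Fraction(t - 2 * h + 2, t + 2)


def hop_distribution_adaptive(num_decisions):
    """Exact law of the hop distance after the given number of keep/pass decisions."""
    dist = {1: Fraction(1)}  # after t = 0 the virtual source is one hop from the source
    for step in range(1, num_decisions + 1):
        t = 2 * step  # decisions are assumed to happen at even timesteps
        nxt = {}
        for h, prob in dist.items():
            keep = keep_probability(t, h)
            nxt[h] = nxt.get(h, Fraction(0)) + prob * keep
            nxt[h + 1] = nxt.get(h + 1, Fraction(0)) + prob * (1 - keep)
        dist = nxt
    return dist


def hop_distribution_latent(num_decisions):
    """Same law when q ~ Uniform[0, 1] is drawn once and each decision passes w.p. q."""
    n = num_decisions
    # The integral of C(n, m) q^m (1 - q)^(n - m) over [0, 1] equals C(n, m) m! (n - m)! / (n + 1)!.
    return {
        1 + m: Fraction(comb(n, m) * factorial(m) * factorial(n - m), factorial(n + 1))
        for m in range(n + 1)
    }


for n in range(6):
    print(n, hop_distribution_adaptive(n) == hop_distribution_latent(n))
```

Using `Fraction` keeps the comparison exact, so agreement of the two distributions does not depend on floating-point tolerance.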

Further, in practice, the source could simulate the whole process in advance. The control packet would simply reveal to
each node how long it should wait before further propagating the message. Under this implementation, spy nodes only observe
timestamps $T_{s_1}$ and $T_{s_2}$, parent nodes, and control packets containing the infection delay for the spy and all its descendants
in the infection. Given this, the adversary can exactly determine the timing of infection with respect to the start of the infection,
$T_1$ and $T_2$, and also the latent variables $D$ and $q$. A proof of this statement and the following proposition is provided in
Section VIII-G. The next proposition provides an upper bound on the detection probability for such an adversary.

Fig. 17: Spreading on a line. The red node is the message source. Yellow nodes denote nodes that have been, are, or will be
the center of the infected subtree.

Proposition 6.1: When the source is uniformly chosen from $n$ nodes between two spy nodes, the message is spread according
to adaptive diffusion, and the adversary has full access to the timestamps, parent nodes, and control packets received by the
spy nodes, i.e., the observations $T_1$, $T_2$, $q$, and $D$, the adversary can compute the ML estimate:


vml = 


Ti+2 

2 

Ti+3 

2 


L? 

L? 


Ti-2 


Ti-1 


J , if Ti even and D = i , 
J , if Ti odd and D = i, 


(38) 


1 +, if Ti odd and D = r . 


where $T_1$ is the time since the start of the spread until $s_1$ receives the message, $q$ is the hidden parameter of the Pólya's
urn process, and $D$ is the initial choice of direction for the virtual source. This estimator achieves a detection probability upper
bounded by

TTs/E 


^ V* = vml ) < + 


(39) 


Equipped with an estimator, we can also simulate adaptive diffusion on a line. Figure 16 (top) illustrates that even with access
to control packets, the adversary achieves probability of detection scaling as $1/\sqrt{n}$, similar to standard diffusion. For a given
value of $T_1$, the posterior and the likelihood are concentrated around the ML estimate, and the source can only hide among
$O(\sqrt{n})$ nodes, as shown in the bottom panel for $T_1 = 58$. In the realistic adversarial setting where control packets are revealed
at spy nodes, adaptive diffusion can only hide as well as standard diffusion over a line.


VII. Future directions and connections to game theory 

Consider a game-theoretic setting where there are two players, the protocol designer and the adversary. The designer can 
choose any strategy to spread the message from a source v*, as long as the message is passed one hop at a time. The adversary 
can choose any strategy (computationally expensive or not) to compute an estimated source $\hat{v}$ given some side information
on the spread. As a result, the source can either be detected or not. In terms of the payoff, the protocol designer wants to 
minimize the probability of detection and the adversary wants to maximize it. 

In this static game setting, adaptive diffusion is a (weakly) dominant strategy under a certain condition. Consider a
snapshot-based adversary and a contact network that is a $d$-regular tree. The special condition we impose is that we only allow
protocols that infect at most, say, $1 + (2(d-1)^{(T+1)/2} - d)/(d-2)$ nodes. In this setting, Theorem 3.1 implies that adaptive
diffusion is dominant up to a vanishing additive factor.

Following our work [42], a game-theoretic formulation of the problem of source obfuscation was recently proposed in HtI. The designer is restricted to use deterministic protocols, and the snapshot-based adversary is restricted to use a certain family of estimators based on Jordan centers. Under these restrictions, it is shown that there is no "dominant" protocol in the Nash equilibrium sense, other than the simple (deterministic) diffusion.

There are several interesting future research directions. First, when infecting more nodes is a priority, a fundamental question is whether there is a dominant strategy for a given target infection rate. Adaptive diffusion achieves the fundamental limit of P(detection) $= 1/N_T$ as long as $N_T \leq 1 + (2(d-1)^{(T+1)/2} - d)/(d-2) \simeq (d-1)^{T/2}$ (see Figure 18) on $d$-regular trees. It is an open question what the fundamental limit is above this threshold, and whether there is an efficient distributed protocol achieving this optimal tradeoff. In particular, if we have to spread deterministically at every time step to achieve the infection speed of $N_T \simeq (d-1)^T$, then the source will be trivially detected as the center of the infection. Above the threshold of $\log N_T = \frac{1}{2} T \log(d-1)$, a variant of adaptive diffusion can achieve the infection rate $\log N_T = \alpha T \log(d-1)$ with probability of detection (in log scale) $(\alpha - 1) T \log(d-1)$ for any $\alpha \in [0.5, 1]$. Hence, the entire grey triangular region in Figure 18 is achievable by adaptive diffusion.


Second, when the same source spreads multiple messages that can be linked, this can be posed as a dynamic game. If the 
adversary observes multiple spreads of infection from a single source, how much does the probability of detection increase as 
a function of the multiplicity of the spread? One possibility is to spread according to adaptive diffusion the first time, and use
exactly the same pattern of spread in the consecutive spread of the following messages from the same source. Hence, from the 
meta-data, there is no more information on who the source is. However, this creates a certain permanent bias in the spread, 
which may be undesirable, depending on the application. 

Next, a set of nodes can collude to spread exactly the same message, starting from multiple sources simultaneously with possible delays. Unless carefully coordinated, such a spread from multiple sources can be easily detected [48], and there is no gain in collusion. However, we can consider an alternative strategy of creating a pseudo-source node to make the source hard to find. At a certain time (possibly $t = 0$), the protocol starts another chain of spread starting from a node far away from the infection so far. This can reduce the probability of detection by a factor of the number of such new infections, at the price of losing the benefits of social filtering and possibly spamming the users with irrelevant messages. We want to be able to measure such a loss in social filtering and characterize the tradeoff.

Fig. 18: The fundamental limit of P(detection) $\geq 1/N_T$ is shown as a solid red line. This is achieved by adaptive diffusion until $\log(N_T) \leq \frac{1}{2} T \log(d-1)$. Infection size at time $T$ ($\log N_T$) is shown on the x-axis in log scale and the probability of detection on the y-axis, also in log scale.


VIII. Proofs 


A. Proof of Theorem 3.1

Spreading rate. Under adaptive diffusion, $G_T$ is a complete $(d-1)$-ary tree (with the exception that the root has $d$ children) of depth $T/2$ whenever $T$ is even. Whenever $T$ is odd, with probability $\alpha_d(T,h)$, $G_T$ is again such a $(d-1)$-ary tree, of depth $(T+1)/2$. With probability $1 - \alpha_d(T,h)$, $G_T$ is made up of two $(d-1)$-ary trees of depth $(T-1)/2$ each, with their roots connected by an edge. Therefore, it follows that when $d > 2$, $N_T$ is given by

$$N_T = \begin{cases} 1, & T = 0, \\ \dfrac{2(d-1)^{(T+1)/2}}{d-2} - \dfrac{2}{d-2}, & T \geq 1,\ T \text{ odd, w.p. } (1-\alpha), \\ \dfrac{d(d-1)^{(T+1)/2}}{d-2} - \dfrac{2}{d-2}, & T \geq 1,\ T \text{ odd, w.p. } \alpha, \\ \dfrac{d(d-1)^{T/2}}{d-2} - \dfrac{2}{d-2}, & T \geq 2,\ T \text{ even.} \end{cases} \tag{40}$$

Similarly, when $d = 2$, $N_T$ can be expressed as follows:

$$N_T = \begin{cases} 1, & T = 0, \\ T + 1, & T \geq 1,\ T \text{ odd, w.p. } (1-\alpha), \\ T + 2, & T \geq 1,\ T \text{ odd, w.p. } \alpha, \\ T + 1, & T \geq 2,\ T \text{ even.} \end{cases} \tag{41}$$

The lower bound on $N_T$ stated in the theorem follows immediately from the above expressions.
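The spreading-rate argument can be checked numerically. The sketch below is not part of the original proof: it counts the nodes of the tree shapes described above level by level and compares the result with the closed form for $d > 2$. The function names (`tree_size`, `snapshot_size`, `closed_form`) and the `token_kept` flag are hypothetical, introduced only for illustration.

```python
def tree_size(depth, root_children, other_children):
    """Number of nodes in a tree of the given depth in which the root has
    `root_children` children and every other non-leaf node has `other_children`."""
    total, level = 1, 1
    for h in range(depth):
        level *= root_children if h == 0 else other_children
        total += level
    return total


def snapshot_size(T, d, token_kept=False):
    """N_T obtained by counting nodes in the shapes described in the proof.

    `token_kept` only matters for odd T: True is the probability-alpha case."""
    if T == 0:
        return 1
    if T % 2 == 0:
        return tree_size(T // 2, d, d - 1)
    if token_kept:
        return tree_size((T + 1) // 2, d, d - 1)
    # Two (d-1)-ary trees of depth (T-1)/2 whose roots share an edge.
    return 2 * tree_size((T - 1) // 2, d - 1, d - 1)


def closed_form(T, d, token_kept=False):
    """Closed-form N_T for d > 2, as derived from the counting argument."""
    if T == 0:
        return 1
    if T % 2 == 0:
        numerator = d * (d - 1) ** (T // 2) - 2
    elif token_kept:
        numerator = d * (d - 1) ** ((T + 1) // 2) - 2
    else:
        numerator = 2 * (d - 1) ** ((T + 1) // 2) - 2
    return numerator // (d - 2)


if __name__ == "__main__":
    for T in range(7):
        print(T, snapshot_size(T, 3), closed_form(T, 3))
```

Counting directly also covers the line ($d = 2$), where the geometric sums degenerate into linear growth in $T$.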


Probability of detection. For any given infected graph $G_T$, the virtual source $v_T$ cannot have been the source node, since the true source always passes the token at timestep $t = 1$. So $P(G_T \mid v = v_T) = 0$. We claim that for any two nodes $u, w \in G_T$ that are not the virtual source at time $T$, $P(G_T \mid u) = P(G_T \mid w) > 0$. This is true iff for any non-virtual-source node $v$, there exists a sequence of virtual sources $\{v_i\}_{i=0}^{T}$ that evolves according to the protocol with $v_0 = v$ and results in the observed $G_T$, and for all $u, w \in G_T \setminus \{v_T\}$, this sequence has the same likelihood. In a tree, a unique path exists between any pair of nodes, so we can always find a valid path of virtual sources from a candidate node $u \in G_T \setminus \{v_T\}$ to $v_T$. We claim that any such path leads to the formation of the observed $G_T$. Due to the regularity of $G$ and the symmetry in $G_T$, for even $T$, $P(G_T \mid v^{(1)}) = P(G_T \mid v^{(2)})$ for all $v^{(1)}, v^{(2)} \in G_T$ with $\delta_H(v^{(1)}, v_T) = \delta_H(v^{(2)}, v_T)$. Moreover, recall that the $\alpha_d(t,h)$'s were designed to satisfy the distribution in Equation Q. Combining these two observations with the fact that we have $(d-1)^h$ infected nodes $h$ hops away from the virtual source, we get that for all $v^{(1)}, v^{(2)} \in G_T \setminus \{v_T\}$, $P(G_T \mid v^{(1)}) = P(G_T \mid v^{(2)})$.

For odd $T$, if the virtual source remains the virtual source, then $G_T$ stays symmetric about $v_T$, in which case the same result holds. If the virtual source passes the token, then $G_T$ is perfectly symmetric about the edge connecting $v_{T-1}$ and $v_T$. Since both nodes are virtual sources (former and present, respectively) and $T > 1$, the adversary can infer that neither node was the true source. Since the two connected subtrees are symmetric and each node within a subtree has the same likelihood of being the source by construction (Equation Q), we get that for all $v^{(1)}, v^{(2)} \in G_T \setminus \{v_T, v_{T-1}\}$, $P(G_T \mid v^{(1)}) = P(G_T \mid v^{(2)})$. Thus at odd timesteps, $P(\hat{v}_{ML} = v^*) \leq 1/(N_T - 2)$.


B. Proof of Proposition 3.2

First, under adaptive diffusion with $\alpha_d(t,h) = 0$, $G_T$ is a complete $(d-1)$-ary tree (with the exception that the root has $d$ children) of depth $T/2$ whenever $T$ is even. $G_T$ is made up of two complete $(d-1)$-ary trees of depth $(T-1)/2$ each, with their roots connected by an edge, whenever $T$ is odd. Therefore, it follows that $N_T$ is a deterministic function of $T$ and is given by

$$N_T = \begin{cases} 1, & T = 0, \\ \dfrac{2(d-1)^{(T+1)/2}}{d-2} - \dfrac{2}{d-2}, & T \geq 1,\ T \text{ odd,} \\ \dfrac{d(d-1)^{T/2}}{d-2} - \dfrac{2}{d-2}, & T \geq 2,\ T \text{ even.} \end{cases} \tag{42}$$

The lower bound on $N_T$ stated in the proposition follows immediately from the above expression.

For any given infected graph $G_T$, it can be verified that any non-leaf node could not have generated $G_T$ under the Tree Protocol. In other words, $P(G_T \mid v \text{ non-leaf node}) = 0$, so such a $v$ could not have started the rumor. On the other hand, we claim that for any two leaf nodes $v_1, v_2 \in G_T$, we have $P(G_T \mid v_1) = P(G_T \mid v_2) > 0$. This is true because for each leaf node $v \in G_T$, there exists a sequence of state values $\{s_{1,u}, s_{2,u}\}_{u \in G_T}$ that evolves according to the Tree Protocol with $s_{1,v} = 1$ and $s_{2,v} = 0$. Further, the regularity of the underlying graph $G$ ensures that all these sequences are equally likely. Therefore, the probability of correct rumor source detection under the maximum likelihood algorithm is given by $P_{ML}(T) = 1/N_{l,T}$, where $N_{l,T}$ represents the number of leaf nodes in $G_T$. It can also be shown that $N_{l,T}$ and $N_T$ are related to each other by the following expression:

$$N_{l,T} = \frac{(d-2) N_T + 2}{d-1}. \tag{43}$$

This proves the expression for $P(\hat{v}_{ML} = v^*)$ given in the proposition.

Expected distance. For any $v^* \in G$ and any $T$, $E[\delta_H(v^*, \hat{v}_{ML})]$ is given by

$$E[\delta_H(v^*, \hat{v}_{ML})] = \sum_{v \in G} \sum_{G_T} P(G_T \mid v^*)\, P(\hat{v}_{ML} = v)\, \delta_H(v^*, v). \tag{44}$$


As indicated above, no matter where the rumor starts from, $G_T$ is a $(d-1)$-ary tree (with the exception that the root has $d$ children) of depth $T/2$ whenever $T$ is even. Moreover, $\hat{v}_{ML} = v$ with probability $1/N_{l,T}$ for all leaf nodes $v$ in $G_T$. Therefore, the above equation can be solved exactly to obtain the expression provided in the statement of the proposition.
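The relation between the number of leaves and the total number of infected nodes used in this proof can be sanity-checked by enumeration. The following sketch is illustrative only; the helper name `nodes_and_leaves` is hypothetical, and the tree shapes are those described in the proof for the case $\alpha_d(t,h) = 0$.

```python
def nodes_and_leaves(T, d):
    """Return (N_T, number of leaves) for T >= 1 when alpha = 0.

    Even T: one tree of depth T/2 whose root has d children.
    Odd T: two (d-1)-ary trees of depth (T-1)/2 joined at their roots."""
    if T % 2 == 0:
        level, total = d, 1 + d          # nodes at depth 1, running total
        for _ in range(T // 2 - 1):
            level *= d - 1
            total += level
        return total, level
    level, total = 1, 1                  # a single subtree, starting at its root
    for _ in range((T - 1) // 2):
        level *= d - 1
        total += level
    return 2 * total, 2 * level


if __name__ == "__main__":
    for T in range(1, 7):
        n_total, n_leaves = nodes_and_leaves(T, 4)
        # Leaf count predicted from N_T by the relation in the proof.
        predicted = ((4 - 2) * n_total + 2) // (4 - 1)
        print(T, n_total, n_leaves, predicted)
```

Under these assumptions the ML detection probability is simply the reciprocal of the second returned value.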


C. Proof of Proposition 3.3

We upper bound the probability of detection by assuming that the adversary takes a snapshot at every time step after $T$; the adversary can also learn the exact value of $T$ by noting the size of the snapshots in successive time steps. The structure of all snapshots after $G_T$ depends deterministically on the binary timeseries of choices to either keep the virtual source token or pass it in each time step after $T$; we refer to this timeseries as $K_T$. The timeseries $K_T$, in turn, is random, with values that depend probabilistically on only the timestamp (which is known to the adversary), the tree degree (known), and the virtual source's distance from the true source (unknown). Because adaptive diffusion does not allow the virtual source to "backtrack", or move closer to the true source over time, the (unique) path from the true source to the virtual source $v_T$ at time $T$ cannot intersect the path comprised of the virtual sources after time $T$ (call it $V_T$), except possibly at $v_T$ itself. Therefore, let us consider the first node in $V_T$ that is not equal to $v_T$; we call it $v_{T'}$. The node $v_{T'}$ is necessarily a neighbor of $v_T$. Then let us define the largest possible subtree of $G_T$ that is rooted at $v_{T'}$ and does not contain $v_T$; we call this subtree $\mathcal{T}_T$.

Now, suppose that by observing the timeseries $K_T$, the adversary could learn the distance between $v^*$ and $v_T$ exactly (this is a worst-case assumption). Let us call that distance $L$. Then the source is equally likely to be any node $w$ at a distance of $L$ hops from $v_T$ such that $w \notin \mathcal{T}_T$. Therefore, we can upper bound the probability of detection by conditioning on $L$ and counting the number of feasible nodes $w$.

We assume for the sake of simplicity that all snapshots are taken at even time steps (including $G_T$), since snapshots at odd time steps do not contribute any additional information, i.e., if the adversary observed $G_T$ at an odd timestep, it could recover $G_{T-1}$ from the subsequent observed snapshots, which is equivalent to observing the first snapshot at time $T - 1$. Then

$$P(\hat{v}_{ML} = v^*) = \sum_{\ell=1}^{T/2} P(L = \ell)\, P(\hat{v}_{ML} = v^* \mid L = \ell). \tag{45}$$

From the previous argument, we have

$$P(\hat{v}_{ML} = v^* \mid L = \ell) = \frac{1}{(d-1)^{\ell}}$$

instead of $1/(d(d-1)^{\ell-1})$, since the entire subtree of $G_T$ containing $V_T$ is excluded from the set of possible candidate sources. Additionally, it is straightforward to compute $P(L = \ell)$ from the properties of adaptive diffusion:

$$P(L = \ell) = \frac{(d-2)(d-1)^{\ell-1}}{(d-1)^{T/2} - 1},$$

so the overall probability of detection is

$$P(\hat{v}_{ML} = v^*) = \frac{d-2}{d-1} \cdot \frac{1}{(d-1)^{T/2} - 1} \cdot \frac{T}{2}. \tag{46}$$

Fig. 19: One realization of the random, irregular-tree branching process. Although each realization of the random process yields a labelled graph, the adversary observes $G_T$ and $\mathcal{G}_T$, which are unlabelled. White nodes are uninfected, grey nodes are infected.

Note that


$$N_T = \frac{d(d-1)^{T/2} - 2}{d-2} \geq \frac{(d-1)^{T/2+1} - 2}{d-2}.$$

Since ((d — l)'^/2+i _ 2 ) > (d — l)'^/^ fQj. all d > 2 and all even T > 2, it holds that Nt > ■ From this, we can 

conclude that Tl2 < Nt + log^_]^(d/2 — 1). It also holds that for all d > 2, (d — — 1 > so we have 

P(t’ML = 1'*) < ^ (log Nt + log(d/2 - 1)), (47) 

which gives the claim. 
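As a numerical check of the conditioning argument, the sketch below sums $P(L = \ell)\, P(\hat{v}_{ML} = v^* \mid L = \ell)$ over $\ell$ with exact rational arithmetic and compares the result with the closed form $\frac{d-2}{d-1} \cdot \frac{T/2}{(d-1)^{T/2} - 1}$. It assumes that the conditional detection probability is $1/(d-1)^{\ell}$, which is our reading of the counting argument above (the $(d-1)^{\ell}$ nodes at distance $\ell$ from $v_T$ that lie outside the excluded subtree); the function names are hypothetical.

```python
from fractions import Fraction


def detection_probability(T, d):
    """Sum over l of P(L = l) * P(detect | L = l) for even T and d > 2."""
    half = T // 2
    normalizer = (d - 1) ** half - 1
    total = Fraction(0)
    for ell in range(1, half + 1):
        p_distance = Fraction((d - 2) * (d - 1) ** (ell - 1), normalizer)
        p_detect = Fraction(1, (d - 1) ** ell)  # assumed conditional probability
        total += p_distance * p_detect
    return total


def closed_form(T, d):
    """(d-2)/(d-1) * (T/2) / ((d-1)^(T/2) - 1)."""
    return Fraction(d - 2, d - 1) * Fraction(T, 2) / ((d - 1) ** (T // 2) - 1)


if __name__ == "__main__":
    for T in (2, 4, 6, 8):
        print(T, detection_probability(T, 3), closed_form(T, 3))
```

Each term of the sum equals $(d-2)/\big((d-1)((d-1)^{T/2} - 1)\big)$ regardless of $\ell$, which is why the total grows only linearly in $T$ while $N_T$ grows exponentially.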


D. Proof of Theorem 3.5

We first analyze the probability of detection for any given estimator (see Eq. (52)); we then show that the proposed estimator is a MAP estimator, maximizing this probability of detection. Finally, we show that using the MAP estimator gives the claimed probability of detection.

We begin with some definitions. Consider the following random process, in which we fix a source $v^*$ and generate a (random) labelled tree $G^{(t)}$ for each time $t$ and for a given degree distribution $D$. At time $t = 0$, $G^{(0)}$ consists of a single node $v^*$, which is given the label 1. The source $v^*$ draws a degree $d_1$ from $D$ and generates $d_1$ child nodes, labelled in order of creation (i.e., 2 through $d_1 + 1$). At the next time step, $t = 1$, the source picks one of these neighbors uniformly at random to be the new virtual source and infects that neighbor. According to the protocol, each time a node $v$ is infected, $v$ draws its degree $d_v$ from $D$, then generates $d_v - 1$ labelled child nodes. So at the end of time $t = 1$, $G^{(1)}$ contains the source and its uninfected neighbors, as well as the new virtual source and its uninfected neighbors. An example of $G^{(2)}$ is shown in Figure 19 (left panel) with $d_1 = 3$ and the virtual source at node 3. Grey nodes are infected and white nodes are uninfected neighbors. Note that the node labelled 1 is always exactly one hop from a leaf of $G^{(t)}$ for all $t > 0$; also, nodes infect their neighbors in ascending order of their labels. The leaves of $G^{(t)}$ represent the uninfected neighbors of infected leaves in standard adaptive diffusion spreading over a given graph. Define $\Omega_{(t,D)}$ as the set of all labelled trees generated at time $t$ according to this random process.

At some time $T$, the adversary observes the snapshot of the infected subgraph $G_T$. Notice that we do not need to generate the entire contact network, since $G_T$ is conditionally independent of the rest of the contact network given its one-hop neighbors. Hence, we only need to generate (and consider) the one-hop neighbors of $G_T$ at any given $T$. We use $\mathcal{G}_T$ to denote this random graph that includes $G_T$ and its one-hop neighbors, as generated according to the previously explained random process. Notice that the adversary only observes $\mathcal{G}_T$, which is an unlabelled snapshot of the infection and its one-hop neighbors (see Figure 19, right panel). We refer to the leaves of $G_T$ as 'infected leaves', denoted by $\partial G_T$, and to the leaves of $\mathcal{G}_T$ as 'uninfected leaves', denoted by $\partial \mathcal{G}_T$. Define

$$L(\mathcal{G}_T) = \{ G \in \Omega_{(T,D)} \mid U(G) = \mathcal{G}_T \},$$

i.e., the set of all labelled graphs (generated according to the described random process) whose unlabelled representation $U(G)$ is equal to the snapshot $\mathcal{G}_T$. Figure 20 illustrates $L(\mathcal{G}_T)$ for the graph $\mathcal{G}_2$ in Figure 19.

Fig. 20: $L(\mathcal{G}_2)$ for the snapshot $\mathcal{G}_2$ illustrated in Figure 19. Boxes (a) and (b) illustrate the two families partitioning $L(\mathcal{G}_2)$.

We define the family $C_{\mathcal{G}_T,v} \subseteq L(\mathcal{G}_T)$ as the set of all labelled graphs whose labeling could have been generated by a breadth-first labeling of $\mathcal{G}_T$ starting at node $v \in \partial G_T$. Here a breadth-first labeling is a valid order of traversal for a breadth-first search of $\mathcal{G}_T$ starting at node $v$. We restrict $v$ to be a valid source for an adaptive diffusion spread, that is, an infected leaf in $\partial G_T$. Note that a BFS labeling starting from two different nodes on the unlabelled tree can yield the same labelled graph. In Figure 20, boxes (a) and (b) illustrate the two families contained in $L(\mathcal{G}_2)$.

Let $P(C_{\mathcal{G}_T,v}) = P(G^{(T)} \in C_{\mathcal{G}_T,v})$ denote the probability that the labelled graph whose snapshot is $\mathcal{G}_T$ is generated from a node $v$. From the definition of the random process for generating labelled graphs, we get


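$$P(C_{\mathcal{G}_T,v}) = \underbrace{\Big( \prod_{w \in G_T} P_D(d_w) \Big)}_{\text{degrees of } G_T}\ \underbrace{Q(\mathcal{G}_T, v)}_{\text{virtual sources}}\ \underbrace{|C_{\mathcal{G}_T,v}|}_{\text{count of isomorphisms}}, \tag{48}$$

where $P_D(d)$ is the probability of observing degree $d$ under degree distribution $D$, and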

'^v^OGt 


Q{Qt,v) = 




is the probability of passing the virtual source from $v$ to the virtual source $v_T$ given the structure of $\mathcal{G}_T$, where $\phi(v, v_T)$ is the unique path from $v$ to $v_T$ in $\mathcal{G}_T$. Eq. (48) holds because for all instances in $C_{\mathcal{G}_T,v}$, the probability of the degrees of the nodes and the probability of the path of the virtual source remain the same.

The probability of observing a given snapshot $\mathcal{G}_T$ is precisely $P(G^{(T)} \in L(\mathcal{G}_T))$. Notice that the families $C_{\mathcal{G}_T,v}$ partition $L(\mathcal{G}_T)$ into families of labelled trees that are generated from the same source. This gives the following decomposition:

$$P(G^{(T)} \in L(\mathcal{G}_T)) = \sum_{v \in C_{\mathcal{G}_T}} P(C_{\mathcal{G}_T,v}), \tag{49}$$

where we define $C_{\mathcal{G}_T}$ as the set of possible candidates of the source that generate distinct labelled trees, i.e.,

$$C_{\mathcal{G}_T} = \{ v \in \partial G_T \mid C_{\mathcal{G}_T,v} \neq C_{\mathcal{G}_T,v'}\ \ \forall\, v' \in C_{\mathcal{G}_T},\ v' \neq v \}. \tag{50}$$


Notice that this set is not unique, since there can be multiple nodes that represent the same family $C_{\mathcal{G}_T,v}$. We pick one such node $v$ to represent the class of nodes that can generate the same family of labelled trees. We use this $v$ to index these families and not to denote any particular node in $\partial G_T$.

Consider an estimate of the source $\hat{v}(\mathcal{G}_T)$. In general, $\hat{v}(\mathcal{G}_T)$ is a random variable, potentially selected from a set of candidates. We define detection ($D$) as the event in which $\hat{v}(\mathcal{G}_T) = v_1(G^{(T)})$, i.e., the estimator outputs the node that started the random process. We can partition the set of candidate nodes $\partial G_T$ by grouping together those nodes that are indistinguishable to the estimator into classes. Precisely, we define a subset of nodes indexed by $v \in C_{\mathcal{G}_T}$:

$$X_{\mathcal{G}_T,v} = \{ v' \in \partial G_T \mid C_{\mathcal{G}_T,v'} = C_{\mathcal{G}_T,v} \}. \tag{51}$$

For a given snapshot, there are as many classes as there are families. In Figure 20, the class associated with family (a) has one element, namely the node labeled '1' in family (a). The class associated with family (b) contains two nodes: the node labeled '1' in family (b), and the node labeled '5' in the rightmost graph of family (b), since both nodes give rise to the same family.

We consider, without loss of generality, an estimator that selects a node in a given class with probability $P(\hat{v}(\mathcal{G}_T) \in X_{\mathcal{G}_T,v})$. Notice that $|X_{\mathcal{G}_T,v}|$ denotes the number of (indistinguishable) source candidates in this class. From Eq. (49), the probability of detection given a snapshot is

$$P(D \mid \mathcal{G}_T) = \frac{P\big(\{G^{(T)} \in L(\mathcal{G}_T)\} \cap D\big)}{P(G^{(T)} \in L(\mathcal{G}_T))} \tag{52}$$

$$= \frac{\sum_{v \in C_{\mathcal{G}_T}} P(C_{\mathcal{G}_T,v})\, P(D \mid G^{(T)} \in C_{\mathcal{G}_T,v})}{\sum_{v \in C_{\mathcal{G}_T}} P(C_{\mathcal{G}_T,v})}, \tag{53}$$

where $P(D \mid G^{(T)} \in C_{\mathcal{G}_T,v}) = P(\hat{v}(\mathcal{G}_T) \in X_{\mathcal{G}_T,v}) / |X_{\mathcal{G}_T,v}|$. We make the following observation:

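Lemma 8.1:

$$\frac{P(C_{\mathcal{G}_T,v}) / |X_{\mathcal{G}_T,v}|}{\sum_{v' \in C_{\mathcal{G}_T}} P(C_{\mathcal{G}_T,v'})} = \frac{1}{d_{v_T} \prod_{w \in \phi(v, v_T) \setminus \{v, v_T\}} (d_w - 1)}. \tag{54}$$

(Proof in Section VIII-D1.)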

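Substituting Equation (54) into Equation (53), we get that

$$P(D \mid \mathcal{G}_T) = \sum_{v \in C_{\mathcal{G}_T}} \frac{P(\hat{v}(\mathcal{G}_T) \in X_{\mathcal{G}_T,v})}{d_{v_T} \prod_{w \in \phi(v, v_T) \setminus \{v, v_T\}} (d_w - 1)}.$$

Since each term of this summation is bounded by

$$\frac{P(\hat{v}(\mathcal{G}_T) \in X_{\mathcal{G}_T,v})}{d_{v_T} \prod_{w \in \phi(v, v_T) \setminus \{v, v_T\}} (d_w - 1)} \leq P(\hat{v}(\mathcal{G}_T) \in X_{\mathcal{G}_T,v}) \cdot \frac{1}{\min_{v' \in C_{\mathcal{G}_T}} d_{v_T} \prod_{w \in \phi(v', v_T) \setminus \{v', v_T\}} (d_w - 1)},$$

and $\sum_{v \in C_{\mathcal{G}_T}} P(\hat{v}(\mathcal{G}_T) \in X_{\mathcal{G}_T,v}) = 1$, it must hold that

$$P(D \mid \mathcal{G}_T) \leq \max_{v \in C_{\mathcal{G}_T}} \frac{1}{d_{v_T} \prod_{w \in \phi(v, v_T) \setminus \{v, v_T\}} (d_w - 1)}.$$

This upper bound on the detection probability is achieved exactly if we choose weight $P(\hat{v}(\mathcal{G}_T) \in X_{\mathcal{G}_T,v}) = 1$ for the class(es) minimizing the product $\prod_{w \in \phi(v, v_T) \setminus \{v, v_T\}} (d_w - 1)$, i.e.,

$$\hat{v}(\mathcal{G}_T) = \arg\min_{v \in \partial G_T} \prod_{w \in \phi(v, v_T) \setminus \{v, v_T\}} (d_w - 1).$$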
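The estimator above (choose the infected leaf whose path to the virtual source minimizes the product of $d_w - 1$ over interior nodes) can be sketched as a breadth-first traversal. This is an illustrative sketch, not the authors' implementation: the adjacency-dictionary representation, the function name `map_source_estimate`, and the example graph are assumptions. Degrees are taken in the contact network, so `adj` also lists uninfected neighbors.

```python
from collections import deque


def map_source_estimate(adj, infected, virtual_source):
    """Return (leaf, product) for the infected leaf minimizing the product of
    (degree - 1) over the interior nodes of its path to the virtual source."""
    best_leaf, best_score = None, None
    queue = deque([(virtual_source, None, 1)])
    while queue:
        node, parent, score = queue.popleft()
        children = [w for w in adj[node] if w in infected and w != parent]
        if not children and node != virtual_source:
            # Infected leaf: `score` is the product over interior path nodes.
            if best_score is None or score < best_score:
                best_leaf, best_score = node, score
            continue
        # `node` becomes an interior node for its descendants, unless it is
        # the virtual source, which is excluded from the product.
        factor = 1 if node == virtual_source else len(adj[node]) - 1
        for w in children:
            queue.append((w, node, score * factor))
    return best_leaf, best_score


if __name__ == "__main__":
    # Hypothetical snapshot: virtual source 0; nodes 6, 7, 8 are uninfected.
    adj = {0: [1, 2], 1: [0, 3, 4], 2: [0, 5], 3: [1, 6], 4: [1, 7], 5: [2, 8]}
    infected = {0, 1, 2, 3, 4, 5}
    leaf, product = map_source_estimate(adj, infected, 0)
    print(leaf, product, 1 / (len(adj[0]) * product))
```

In the example, leaf 5 is reached through a degree-2 node, so its product is 1 and the resulting detection probability is $1/(d_{v_T} \cdot 1) = 1/2$, whereas leaves 3 and 4 each have product 2.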

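1) Proof of Lemma 8.1: We have that

$$P(C_{\mathcal{G}_T,v}) = \underbrace{\Big( \prod_{w \in G_T} P_D(d_w) \Big)}_{\text{degrees of } G_T}\ \underbrace{Q(\mathcal{G}_T, v)}_{\text{virtual sources}}\ \underbrace{|C_{\mathcal{G}_T,v}|}_{\text{count of isomorphisms}},$$

where $v$ is a feasible source for the adaptive diffusion process, i.e., a leaf of the infection $G_T$. The proof of the lemma proceeds in four steps:

1) We first recursively define a function $H(\mathcal{G}_T, v)$ that is equal to $|C_{\mathcal{G}_T,v}|$. This function is defined over any balanced, undirected tree and node; the tree need not be generated via the previously described adaptive diffusion branching process. In addition to $H(\mathcal{G}_T, v)$, we are interested in $H(\mathcal{G}_T, v_T)$.

2) We show that

$$P(C_{\mathcal{G}_T,v}) = \Big( \prod_{v' \in G_T} P_D(d_{v'}) \Big) H(\mathcal{G}_T, v_T)\, \frac{|X_{\mathcal{G}_T,v}|}{d_{v_T} \prod_{w \in \phi(v, v_T) \setminus \{v, v_T\}} (d_w - 1)}. \tag{55}$$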














Fig. 21: A realization of the random labeling process given an unlabeled snapshot.

Fig. 22: The set $\mathcal{R}_{\mathcal{G}_T,v}$ for the snapshot and node specified in Figure 21.

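3) We show that

$$\sum_{v \in C_{\mathcal{G}_T}} P(C_{\mathcal{G}_T,v}) = \Big( \prod_{v' \in G_T} P_D(d_{v'}) \Big) H(\mathcal{G}_T, v_T). \tag{56}$$

4) We combine steps (2) and (3) to show the result.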

Step 1. We wish to define $H(\mathcal{G}_T, v)$, a function that counts the number of distinct, isomorphic graphs generated by a breadth-first search of a balanced tree $\mathcal{G}_T$, rooted at node $v$. Consider a random process defined as follows. Given $\mathcal{G}_T$ and root node $v$, the process starts at $v$ and labels it 1. For each neighbor $w$ of node 1, the process randomly orders $w$'s unlabelled neighbors, and labels them in order of traversal. The process proceeds to label nodes in a breadth-first fashion, traversing each node's unlabelled neighbors in a randomly selected order, until all nodes have been visited. Let $R^{(T)}_{\mathcal{G}_T,v}$ denote a labelled tree generated according to the described random process (see Figure 21).

The function $H(\mathcal{G}_T, v)$ counts the number of distinct graphs that can result from this random process over $\mathcal{G}_T$ when starting from node $v$. More precisely, define $\mathcal{R}_{\mathcal{G}_T,v}$ as the set of all possible trees $R^{(T)}_{\mathcal{G}_T,v}$ generated according to this random labeling. $H(\mathcal{G}_T, v)$ is defined as the size of $\mathcal{R}_{\mathcal{G}_T,v}$. Figure 22 illustrates $\mathcal{R}_{\mathcal{G}_T,v}$ for $\mathcal{G}_T$ and $v$ shown in Figure 21. In that example, $H(\mathcal{G}_T, v) = 3$.

Recall that $\mathcal{G}_T$ is a balanced tree. The Jordan center of this tree is denoted by $v_T$. If $\mathcal{G}_T$ was generated according to adaptive diffusion, $v_T$ would be the virtual source at time $T$. Although we say $\mathcal{G}_T$ is rooted at $v$, we define each node's children with respect to $v_T$. That is, node $z$ is among $w$'s children if $z$ is a neighbor of $w$ and $z \notin \phi(w, v_T)$.




Let Qrj 


denote the subtree of Qt rooted at node Vj with node Vi as parent of Vj (let Qj 


= Qt)- Each node Vi in 


Qt will have some number of child subtrees. Some of these subtrees may be identical (i.e., given a realization Rg^-^y of the 
labeling random process, they would be isomorphic); let denote the number of distinguishable subtrees of node v. We use 
,..., to denote the number of each distinct subtree appearing among the child subtrees of node v (recall children are 
defined with respect to vt)- For example, node v in graph Qt in Figure (left panel) has A" = 1 and A2 = 2, since the first 
of u’s child subtrees is equal only to itself, and the second (middle) subtree is isomorphic to the subtree on the right. If there 
exists a neighboring, unvisited subtree rooted at a parent of v, then we say Ag = 1 (by definition, there will only be one such 
subtree, and it cannot be equal to any child subtrees because Gt is balanced). Otherwise, we say Ag = 0. This distinction 
becomes relevant if u ^ vt- For example the figure below shows a tree that is rooted at w ^ vt- In computing H{Qt,w), 
we have A^ = 1 because there is an unvisited branch from w that contains vt, and A^" = 2 because both child subtrees of 
w are identical. 



Let 7” denote the unvisited neighbors of node v in Qt- We give a recursive expression for computing H{Qt,v)- 
Lemma 8.2: 

H{Qt,v) = 

Proof: We show this by induction on the depth A of Qt (rooted at v). For X = 1, Qt has a node v and dy neighbors. Every 
realization of the random breadth-first labeling of Qt will yield an identical graph since the neighbors of v are indistinguishable, 
so H{Qt,v) = (^”) = 1. 


AV Av 

AAq, ^ 1, . . . 


,a; 


n HiQf-^^,w). 


(57) 
Now suppose Equation (57) holds for all graph-node pairs $(\mathcal{G}_T, v)$ with $\lambda < \lambda_0$; we want to show that it holds for $\lambda = \lambda_0$.
We can represent Qt as a root node v and dy subtrees; for w € 7 ’'. Since each subtree has depth at most Ao — 1, we 

can compute for each subtree using equation 57 (from the inductive hypothesis). 

Suppose we impose (any) valid labeling on Qt starting from v, we refer to the labeled graph as Given Rg^y, we 

order the subtrees of a node in ascending order of their numeric labels. For any fixed ordering of the dy subtrees of v, we 
have nonidentical labelings of Qt that respect the ordering of subtrees and are isomorphic to any given 

realization Rg^^^y. At most, there can be dyl arrangements of the subtrees. However, some of the subtrees are isomorphic, 
so this value over-counts the number of distinct arrangements. That is, switching the order of two nonidentical, isomorphic 
subtrees is the same as preserving the order and changing both subtrees to the appropriate nonidentical, isomorphic subtree; 
this is already accounted for in the product ntuG 7 ” H{G'^'^^w). AJ! of the dy \ permutations of u’s subtrees permute the jth 
unique subtree with isomorphisms of itself. As such, the non-redundant number of different arrangements of the subtrees of 


node V is 


d„! 




). This gives the expression in Equation ( |57l i. 


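Step 2. We want to show that

$$P(C_{\mathcal{G}_T,v}) = \Big( \prod_{v' \in G_T} P_D(d_{v'}) \Big) \frac{H(\mathcal{G}_T, v_T)\, |X_{\mathcal{G}_T,v}|}{d_{v_T} \prod_{w \in \phi(v, v_T) \setminus \{v, v_T\}} (d_w - 1)}.$$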

Since f‘(Cg^^y) = (IIugGt ^D^dy)) Q{Qt,v)H{Qt,v), this is equivalent to showing that 

H{Qt,v) ^ _ 

H{Qt,vt) Q{QT,v)dy^ n {dyj — 1) 

\{v,vt} 

dy I I 

A IXPt.i’I- 


The expressions for H{Qt,vt) and H(Qt,v) differ in that the former starts at the virtual source and counts all subtrees by 
“trickling down" the tree (i.e., = 0 for all w G Gt), whereas the latter progresses from an infected leaf v to the virtual 

source, then recurses over the remaining, unvisited subtrees of vt- Let Pi denote the Ah node in the path from v to vt, which 
has length We get 

dp2 - 1 

l,Af^-l,...,A^^^ 


ll H{QP-^^,w)x 

uGT^ni-Pi.Pa} 


^Pe-: 


- 1 


n 




1 A^^-i I -li ^w)x 

LA, 

\{Pf_2,PH 

AA _ / ) n 




where each line corresponds to the terms that result from recursively moving up the path from v = Pi to vt = Pi- Similarly, 
we have 

^P 2 "" ^P 2 ) n ,w)x 

!’■■■’ W^^Pl\{Px,P3} 


dpe_i — 1 


Af^-^.Afi-^ 


Pl-1 


1 ’ ■ • • ’ 


dpi 




n ,w)x 


£-1 


Pi-1 


\{p^_ 2 .pa 




Pe we-y^t\{Pi-i} 
Here we have expanded the expression in terms of the path from $v$ to $v_T$ to make the simplification clearer, where $v$ is the node over which we previously computed $H(\mathcal{G}_T, v)$. Computing the ratio of $H(\mathcal{G}_T, v)$ to $H(\mathcal{G}_T, v_T)$, all the rightmost products of each line cancel. We are left with the ratio of the combinatorial expressions, which simplifies to


H{Gt, v) 
H{Qt,Vt) 


dp. 


^A 5 ’+\..A^ 


Vt — 1 a Vt 


A^T^ 


(58) 


Each $A_i$ denotes the number of child subtrees that are identical to the one containing $v$, for a given root. As such, the product of the $A$'s above is precisely the number of candidates in the class being considered, or $|X_{\mathcal{G}_T,v}|$. That is, since they are indistinguishable in the unlabelled graph, they generate the same family $C_{\mathcal{G}_T,v}$.


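Step 3. We have

$$\sum_{v \in C_{\mathcal{G}_T}} P(C_{\mathcal{G}_T,v}) = \sum_{v \in C_{\mathcal{G}_T}} \Big( \prod_{w \in G_T} P_D(d_w) \Big) H(\mathcal{G}_T, v_T)\, \frac{|X_{\mathcal{G}_T,v}|}{d_{v_T} \prod_{w \in \phi(v, v_T) \setminus \{v, v_T\}} (d_w - 1)}$$

$$= \Big( \prod_{w \in G_T} P_D(d_w) \Big) H(\mathcal{G}_T, v_T) \times \sum_{v \in C_{\mathcal{G}_T}} \frac{|X_{\mathcal{G}_T,v}|}{d_{v_T} \prod_{w \in \phi(v, v_T) \setminus \{v, v_T\}} (d_w - 1)}$$

$$= \Big( \prod_{w \in G_T} P_D(d_w) \Big) H(\mathcal{G}_T, v_T) \times \sum_{v \in \partial G_T} \frac{1}{d_{v_T} \prod_{w \in \phi(v, v_T) \setminus \{v, v_T\}} (d_w - 1)}, \tag{59}$$

where (59) follows because every leaf in the graph is a candidate source in exactly one class. We wish to show that this last summation sums to 1. Consider a random process over $G_T$. The process starts at the virtual source $v_T$, and in each timestep it moves one hop away from $v_T$. It chooses among the (unvisited) children of a node uniformly at random. After $T/2$ hops, the process is necessarily at one of the leaves of $G_T$, and the probability of landing at a particular leaf $v$ is precisely $1 / \big( d_{v_T} \prod_{w \in \phi(v, v_T) \setminus \{v, v_T\}} (d_w - 1) \big)$. Therefore, the sum of this quantity over all leaves $v \in \partial G_T$ is 1.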

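Step 4. Combining the results from steps 2 and 3, we get that

$$\frac{P(C_{\mathcal{G}_T,v}) / |X_{\mathcal{G}_T,v}|}{\sum_{v' \in C_{\mathcal{G}_T}} P(C_{\mathcal{G}_T,v'})} = \frac{\Big( \prod_{w \in G_T} P_D(d_w) \Big) H(\mathcal{G}_T, v_T)\, \dfrac{|X_{\mathcal{G}_T,v}| / |X_{\mathcal{G}_T,v}|}{d_{v_T} \prod_{w \in \phi(v, v_T) \setminus \{v, v_T\}} (d_w - 1)}}{\Big( \prod_{w \in G_T} P_D(d_w) \Big) H(\mathcal{G}_T, v_T)} = \frac{1}{d_{v_T} \prod_{w \in \phi(v, v_T) \setminus \{v, v_T\}} (d_w - 1)}.$$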


E. Proof of Theorem 3.7

To facilitate the analysis, we consider an alternative random process that generates unlabeled graphs $G'_T$ according to the same distribution as $G_T$ (i.e., the infected, unlabeled subgraph embedded in $U(G^{(T)})$ from the proof of Theorem 3.5). For a given degree distribution $D$ and a stopping time $T$, the new process is defined as a Galton-Watson process in which the set of offspring at the first time step is drawn from $D$ and the offspring at subsequent time steps are drawn from $D - 1$. At time $t = 0$, a given root node $v_T$ draws its degree $d_{v_T}$ from $D$, and generates $d_{v_T}$ child nodes. The resulting tree now has depth 1. In each subsequent time step, the process traverses each leaf $v$ of the tree, draws its degree $d_v$ from $D$, and generates $d_v - 1$ children. The random process continues until the tree has depth $T/2$, since under adaptive diffusion, the infected subgraph at even time $T$ has depth $T/2$. Because the probability of detection from Theorem 3.5 does not depend on the degrees of the leaves of $G_T$, the random process stops at depth $T/2$ rather than $T/2 + 1$. We call the output of this random process $G'_T$.
Fig. 23: Pruning of a snapshot. In this example, the distribution $D$ allows nodes to have degree 2 or 3, so we prune all descendants of nodes with degree 3 that are more than $c \log(t_0)$ hops from the root. In this example, $p_1(f_1 - 1) < 1$ and the pruned random process eventually goes extinct.


The distribution of $G'_T$ is identical to the distribution of $G_T$ under the previous random process, which follows from the proof of Theorem 3.5. We therefore use $G_T$ to denote the resulting output in the remainder of this proof.

Distribution $D$ is a multinomial distribution with support $f = (f_1, \ldots, f_\eta)$ and probabilities $p = (p_1, \ldots, p_\eta)$. Without loss of generality, we assume $2 \leq f_1 < \ldots < f_\eta$. Let $\mu_D$ denote the mean number of children generated by $D$:

$$\mu_D = \sum_{i=1}^{\eta} p_i (f_i - 1).$$

There are two separate classes of distributions, which we deal with as separate cases. 


Case 1: When $p_1(f_1 - 1) > 1$, we claim that with high probability, there exists a leaf node $v$ in $\partial G_T$ such that on the unique path from the root $v_T$ to this leaf $v$, all nodes have the minimum degree $f_1$, except for a vanishing fraction. To prove this claim, consider a different graph $H_T$ derived from $G_T$ by pruning large-degree nodes:

1) For a fixed, positive $c$, find $t_0$ such that $T/2 = t_0 + c \log(t_0)$.

2) Initialize $H_T$ to be identical to $G_T$.

3) For each node $v \in H_T$, if the hop distance $\delta_H(v, v_T) < c \log(t_0)$, do not modify that node.

4) For each node $v \in H_T$, if the hop distance $\delta_H(v, v_T) \geq c \log(t_0)$ and $d_v > f_1$, prune out all the children of $v$, as well as all their descendants (Figure 23).

We claim that this pruned process survives with high probability. The branching process that generates $H_T$ is equivalent to a Galton-Watson process that uses distribution $D - 1$ for the first $c \log(t_0)$ generations, and a different degree distribution $D' - 1$ for the remaining generations; $D'$ has support $f' = (f_1, 1)$, probability mass $p' = (p_1, 1 - p_1)$, and mean number of children $\mu_{D'} = p_1(f_1 - 1)$.

Note that $f_1 \geq 3$ by the assumption that $p_1(f_1 - 1) > 1$. Hence, the inner branching process up to $c \log t_0$ has probability of extinction equal to 0. This means that at a hop distance of $c \log t_0$ from $v_T$, there are at least $(f_1 - 1)^{c \log t_0}$ nodes. Each of these nodes can be thought of as the source of an independent Galton-Watson branching process with degree distribution $D' - 1$. By the properties of Galton-Watson branching processes ([49], Thm. 6.1), since $\mu_{D'} > 1$ by assumption, each independent branching process' asymptotic probability of extinction is the unique solution of $g_{D'}(s) = s$ for $s \in [0, 1)$, where $g_{D'}(s) = p_1 s^{f_1 - 1} + (1 - p_1)$ denotes the probability generating function of the distribution $D'$. Call this solution $\theta_{D'}$. The probability of any individual Galton-Watson process going extinct in the first generation is exactly $1 - p_1$. It is straightforward to show that $g_{D'}(s)$ is convex and $g_{D'}(1 - p_1) > 1 - p_1$, which implies that the probability of extinction is nondecreasing over successive generations and upper bounded by $\theta_{D'}$. Then for the branching process that generates $H_T$, the overall probability of extinction (for a given time $T$) is at most $\theta_{D'}^{(f_1 - 1)^{c \log t_0}}$. Increasing the constant $c$ therefore decreases the probability of extinction. If there exists at least one leaf at depth $T/2$ (i.e., extinction did not occur), then there exists at least one path in $H_T$ of length $t_0 - c \log t_0$ in which every node (except possibly the final one) has the minimum degree $f_1$. This gives


$$\frac{\log(\Lambda_{G_T})}{T/2} \leq \frac{t_0\log(f_1 - 1) + c\log(t_0)\log(f_\eta - 1)}{t_0 + c\log(t_0)} \tag{60}$$

$$\leq \log(f_1 - 1) + \frac{c\log t_0}{t_0}\log\left(\frac{f_\eta - 1}{f_1 - 1}\right), \tag{61}$$

with probability at least $1 - \theta_{D'}^{(f_1 - 1)^{c\log t_0}} = 1 - \theta_{D'}^{t_0^{c\log(f_1 - 1)}} = 1 - e^{-C_{D'} t_0^{c\log(f_1 - 1)}}$, where $C_{D'} = \log(1/\theta_{D'})$ and the upper bound in (60) comes from assuming all the interior nodes have maximum degree $f_\eta$. Since $H_T$ is a subgraph of a valid snapshot $G_T$, there exists a path in $G_T$ from the virtual source $v_T$ to a leaf of the tree where the hop distance of the path is exactly $T/2$, and at least $t_0$ nodes have the minimum degree $f_1$. Since the second term in (61) is $o(t_0)$, the claim follows. The lower bound $\log(\Lambda_{H_T})/(T/2) \geq \log(f_1 - 1)$ holds by definition. Therefore, for any $\delta > 0$, by setting $T$ (and consequently, $t_0$) large enough, we can make the second term in (61) arbitrarily small. Thus, for $T \geq C'_{D,\delta}$, where $C'_{D,\delta}$ is a constant that depends only on the degree distribution and $\delta$, the result holds.

Fig. 24: Pruning of a snapshot using multiple types. In this example, the distribution $D$ allows nodes to have degree 2 or 3. We take $t_0 = 2$ and $r = 0.5$, so all descendants of nodes with type $r t_0 = 1$ are pruned.
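To make the Case 1 extinction argument concrete, the sketch below iterates the generating function of the pruned offspring law, assumed here to be $g_{D'}(s) = p_1 s^{f_1 - 1} + (1 - p_1)$, from $s = 0$ to approximate its smallest fixed point, and then evaluates the bound obtained by raising that value to the number of independent subtrees. The function names and the example parameters are illustrative assumptions.

```python
import math


def extinction_probability(p1: float, f1: int,
                           iters: int = 10_000, tol: float = 1e-12) -> float:
    """Smallest fixed point of g(s) = p1*s**(f1-1) + (1-p1) on [0, 1].

    Iterating g from s = 0 gives the extinction probability after
    1, 2, ... generations, which increases to the fixed point.
    """
    s = 0.0
    for _ in range(iters):
        nxt = p1 * s ** (f1 - 1) + (1.0 - p1)
        if abs(nxt - s) < tol:
            return nxt
        s = nxt
    return s


def overall_extinction_bound(p1: float, f1: int, c: float, t0: float) -> float:
    """Bound theta ** ((f1-1) ** (c*log t0)) on the pruned process dying out."""
    theta = extinction_probability(p1, f1)
    n_subtrees = (f1 - 1) ** (c * math.log(t0))
    return theta ** n_subtrees
```

With $p_1 = 0.8$ and $f_1 = 3$, the fixed-point equation $0.8 s^2 + 0.2 = s$ has roots $0.25$ and $1$, so the iteration settles near $0.25$; increasing $c$ shrinks the overall bound, as the text argues.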


Case 2: Consider the case when $p_1(f_1 - 1) \leq 1$. By the properties of Galton-Watson branching processes ([49], Thm. 6.1), the previous pruned random process that generated graphs $H_T$ goes extinct with probability approaching 1. This implies that with high probability there is no path from the root to a leaf that consists of only minimum degree nodes.

Instead, we introduce a Galton-Watson process with multiple types, derived from the original process. Our approach is to assign a numeric type to each node in $G_T$ according to the number of non-minimum-degree nodes in the unique path between that node and the virtual source. If a node's path to $v_T$ contains too many nodes of high degree, then we prune the node's descendants. The challenge is to choose the smallest pruning threshold that still ensures the pruned tree will survive with high probability. Knowing this threshold allows us to precisely characterize $\Lambda_{G_T}$ for most of the instances.

To simplify the discussion, we start by considering a special case in which $D$ allows nodes to take only two values of degrees, i.e., $\eta = 2$. We subsequently extend the results for $\eta = 2$ to larger, finite values of $\eta$. With a slight abuse of notation, consider a new random process $H_T$ derived from $G_T$ by pruning large degree nodes in the following way:

1) For a fixed, positive $c$, find $t_0$ such that $T/2 = t_0 + c\log(t_0)$.

2) Initialize $H_T$ to be identical to $G_T$.

3) For each node $v \in H_T$, if the hop distance $\delta_H(v, v_T) < c\log(t_0)$, do not modify that node, and assign it type $\xi_v = 0$.

4) For each node $v \in H_T$, if the hop distance $\delta_H(v, v_T) \geq c\log(t_0)$, assign $v$ a type $\xi_v$ which is the number of nodes in $\phi(w, v) \setminus \{v\}$ that have the maximum possible degree $f_2$, where $w$ is the closest node in $H_T$ to $v$ such that $\delta_H(w, v_T) \leq c\log(t_0)$ (Figure 24).

5) Given a threshold $r \in (0,1)$, if a node $v$ has type $\xi_v \geq r t_0$, prune out all the descendants of $v$. For example, in Figure 24, if $t_0 = 2$ and the threshold is $r = 0.5$, we would prune out all descendants of nodes with $\xi_v \geq 1$.
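The type-assignment and pruning steps above can be sketched on a small rooted tree as follows. This is a minimal illustration under stated assumptions: the tree is a dictionary from each node to its list of children, the inner region is given as an integer depth `inner_depth` standing in for $c\log(t_0)$, and a node's type counts the maximum-degree nodes strictly above it on the path from the boundary of the inner region. The function name and data layout are hypothetical.

```python
from collections import deque


def prune_by_type(children, root, inner_depth, f_max, threshold):
    """Return (kept_nodes, types) after type-based pruning of a rooted tree.

    children:    dict mapping node -> list of child nodes
    inner_depth: depth of the unpruned inner region (stands in for c*log t0)
    f_max:       the maximum degree in the support of D
    threshold:   prune below any node whose type is >= threshold (r * t0)
    """
    def degree(v):
        # Every non-root node has one extra edge, to its parent.
        return len(children.get(v, [])) + (0 if v == root else 1)

    types = {root: 0}
    kept = {root}
    queue = deque([(root, 0)])
    while queue:
        v, depth = queue.popleft()
        if depth >= inner_depth and types[v] >= threshold:
            continue  # step 5: drop every descendant of v
        for u in children.get(v, []):
            if depth + 1 <= inner_depth:
                types[u] = 0  # step 3: inner region keeps type 0
            else:
                # step 4: inherit the parent's count, plus one if the parent
                # itself has maximum degree
                types[u] = types[v] + (1 if degree(v) == f_max else 0)
            kept.add(u)
            queue.append((u, depth + 1))
    return kept, types
```

With `threshold = 1` and degrees in {2, 3}, this mirrors the setting of Figure 24: any node sitting below a degree-3 node outside the inner region has type 1, so its descendants are removed.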


We show that for an appropriately-chosen threshold $r$, this pruned tree survives with high probability. By choosing the smallest possible $r$, we ensure that $\Lambda_{H_T}$ consists (in all but a vanishing fraction of nodes) of a fraction $r$ nodes with maximum degree, and $(1 - r)$ of minimum degree. This allows us to derive the bounds on $\log(\Lambda_{H_T})/(T/2)$ stated in the claim, which hold with high probability.

Let $k = r t_0$. The process that generates $H_T$ is equivalent to a different random branching process that generates nodes in the following manner: set the root's type $\xi_{v_T} = 0$. At time $t = 0$, the root $v_T$ draws a number of children according to distribution $D$, and generates $d_{v_T}$ children, all type 0. Each leaf generates type 0 children according to child degree distribution $D - 1$ until $c\log(t_0)$ generations have passed. At that point, each leaf $v$ in this branching process (which necessarily has type 0) reproduces as follows: if its type $\xi_v \geq k$, then $v$ does not reproduce. Otherwise, it either generates $(f_1 - 1)$ children with probability $p_1$, each with state $\xi_v$, or it generates $(f_2 - 1)$ children with probability $p_2$, each with state $\xi_v + 1$. This continues for $t_0$ generations. Mimicking the notation from Case 1, we use $D'$ to denote the distribution that gives rise to this modified, multi-type random process (in the final $t_0$ generations); this is a slight abuse of notation since the branching dynamics are multi-type, not defined by realizations of i.i.d. degree random variables.

Lemma 8.3: Consider a Galton-Watson branching process with child degree distribution $D - 1$, where each node has at least one child with probability 1, and $\mu_{D-1} > 1$. Then the number of leaves in generation $t$, $Z^{(t)}$, satisfies the following:

$$Z^{(t)} \geq e^{C_\ell t}$$

with probability at least $1 - e^{-C'_\ell t}$, where both $C_\ell$ and $C'_\ell$ are constants that depend on the degree distribution.

(Proof in Section VIII-E1)

The first $c\log(t_0)$ generations ensure that with high probability, we have at least $e^{C_\ell \log(t_0)}$ independent multi-type Galton-Watson processes originating from the leaves of the inner subgraph; this follows from Lemma 8.3. Here we have encapsulated the constant $c$ from the first $c\log(t_0)$ generations in the constant $C_\ell$. For example, in Figure 24 there are 3 independent Galton-Watson processes starting at the leaves of the inner subgraph. We wish to choose $r$ such that the expected number of new leaves generated by each of these processes, at each time step, is large enough to ensure that extinction occurs with probability less than one. For brevity, let $\alpha = p_1(f_1 - 1)$ and let $\beta = p_2(f_2 - 1)$. Let $x^{(t)}$ denote the $(k + 1)$-dimensional vector of the expected number of leaves generated with each type from 0 to $k$ in generation $t$. This vector evolves according to the following $(k + 1) \times (k + 1)$ transition matrix $M$:

$$M = \begin{pmatrix} \alpha & \beta & & & \\ & \alpha & \beta & & \\ & & \ddots & \ddots & \\ & & & \alpha & \beta \\ & & & & 0 \end{pmatrix}$$

The last row of $M$ is 0 because a node with type $k$ does not reproduce. Since the root of each process always has type 0, we have $x^{(0)} = e_1$, where $e_1$ is the indicator vector with a 1 at index 1 and zeros elsewhere.

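Let $E[Z^{(t)}]$ denote the expected number of new leaves created in generation $t$. This gives

$$E[Z^{(t)}] = e_1^\intercal M^t \mathbf{1}_{(k+1)}, \tag{62}$$

where $\intercal$ denotes a transpose, and $\mathbf{1}_{(k+1)}$ is the $(k + 1)$ all-ones vector. When $t \leq k$, this is a simple binomial expansion of $(\alpha + \beta)^t$. For $t > k$, this is a truncated expansion up to $k$:

$$E[Z^{(t)}] = \sum_{i=0}^{k} \binom{t}{i} \alpha^{t-i} \beta^{i}. \tag{63}$$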
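The matrix recursion and the truncated binomial expansion can be compared numerically. The sketch below propagates the type vector through a matrix with $\alpha$ on the diagonal, $\beta$ on the superdiagonal, and a zero last row, and evaluates truncated binomial sums alongside it; function names and parameter values are illustrative assumptions. In this sketch a type-$k$ node is counted once when it is created but contributes nothing afterwards, so for $t > k$ the matrix value lies between the sums truncated at $k - 1$ and at $k$; the two differ only in the single $i = k$ term.

```python
from math import comb


def expected_leaves_matrix(alpha: float, beta: float, k: int, t: int) -> float:
    """Compute e_1^T M^t 1 by propagating the type vector t times."""
    x = [0.0] * (k + 1)
    x[0] = 1.0  # x^(0) = e_1: the root of each process has type 0
    for _ in range(t):
        nxt = [0.0] * (k + 1)
        for j in range(k):  # types 0..k-1 reproduce; type k does not
            nxt[j] += alpha * x[j]      # minimum-degree parent: same type
            nxt[j + 1] += beta * x[j]   # maximum-degree parent: type + 1
        x = nxt
    return sum(x)


def truncated_binomial(alpha: float, beta: float, upto: int, t: int) -> float:
    """Sum of C(t, i) * alpha**(t-i) * beta**i for i = 0..min(upto, t)."""
    return sum(comb(t, i) * alpha ** (t - i) * beta ** i
               for i in range(min(upto, t) + 1))
```

For $t \leq k$ both functions reduce to $(\alpha + \beta)^t$, matching the full binomial expansion described in the text.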


We seek the necessary and sufficient condition on $r$ for non-extinction, such that $(1/t)\log(E[Z^{(t)}]) > 0$. Consider a binomial random variable $W$ with parameter $\beta/(\alpha + \beta) = \beta/\mu_D$ and $t$ trials. Equation (63) implies that for large $t$,

$$E[Z^{(t)}] = (\alpha + \beta)^t\, P(W \leq k) \tag{64}$$

$$= \exp\left\{ t\left( \log(\mu_D) - D_{KL}(r \,\|\, \beta/\mu_D) \right) + o(t) \right\}, \tag{65}$$

by Sanov's theorem [50]. We wish to identify the smallest $r$ for which $(1/t)\log(E[Z^{(t)}])$ is bounded away from zero. Such an $r$ is a sufficient (and necessary) condition for the multi-type Galton-Watson process to have a probability of extinction less than 1. To achieve this, we define the following set of $r$ such that the exponent in Eq. (65) is strictly positive, for some $\epsilon > 0$:

$$\mathcal{R}_{\alpha,\beta}(\epsilon) = \left\{ r \mid \log(\mu_D) \geq D_{KL}(r \,\|\, \beta/\mu_D) + \epsilon \right\}. \tag{66}$$

Suppose we now choose a threshold $r \in \mathcal{R}_{\alpha,\beta}(\epsilon)$. This is the regime where the modified Galton-Watson process with threshold $r$ has a chance for survival. In other words, the probability of extinction $\theta_{D'}$ is strictly less than one. Precisely, $\theta_{D'}$ is the unique solution to $s = g_{D'}(s)$, where $g_{D'}(s)$ denotes the probability generating function of the described multi-type Galton-Watson process. Using the same argument as in Case 1, we can construct a process where the probability of extinction is asymptotically zero. Precisely, we modify the pruning process such that we do not prune any leaves in the first $c\log(t_0)$ generations. This ensures that with high probability, there are at least $e^{C_\ell \log(t_0)}$ independent multi-type Galton-Watson processes evolving concurrently after time $c\log(t_0)$, each with probability of extinction $\theta_{D'}$. Hence with probability at least $1 - e^{-C_{D'}\log(t_0)}$ (for an appropriate choice of a constant $C_{D'}$ that only depends on the degree distribution $D'$ and the choice of $r$), the overall process does not go extinct.
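The survival condition in (66) can be explored numerically for the two-degree case. The sketch below computes the Bernoulli Kullback-Leibler divergence and bisects for the smallest threshold $r$ in $[0, \beta/\mu_D]$ at which $\log(\mu_D) - D_{KL}(r \,\|\, \beta/\mu_D) - \epsilon$ becomes nonnegative. The function names, the restriction to $[0, \beta/\mu_D]$ (where the divergence is decreasing in $r$), and the example parameters are assumptions made for illustration.

```python
import math


def kl_bernoulli(r: float, q: float) -> float:
    """D_KL(Bernoulli(r) || Bernoulli(q)) in nats, using 0*log(0) = 0."""
    total = 0.0
    if r > 0.0:
        total += r * math.log(r / q)
    if r < 1.0:
        total += (1.0 - r) * math.log((1.0 - r) / (1.0 - q))
    return total


def smallest_surviving_threshold(p1: float, f1: int, p2: float, f2: int,
                                 eps: float = 0.0, tol: float = 1e-12) -> float:
    """Smallest r in [0, beta/mu] with log(mu) >= KL(r || beta/mu) + eps."""
    alpha, beta = p1 * (f1 - 1), p2 * (f2 - 1)
    mu = alpha + beta
    q = beta / mu

    def rate(r: float) -> float:
        return math.log(mu) - kl_bernoulli(r, q) - eps

    if rate(q) < 0.0:
        raise ValueError("no threshold satisfies the condition (log(mu) < eps)")
    if rate(0.0) >= 0.0:
        return 0.0  # even the harshest pruning survives
    lo, hi = 0.0, q  # rate(lo) < 0 <= rate(hi); rate is increasing on [0, q]
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rate(mid) >= 0.0:
            hi = mid
        else:
            lo = mid
    return hi
```

For instance, with $p_1 = 0.9$, $f_1 = 2$, $p_2 = 0.1$, $f_2 = 3$ we have $\alpha = 0.9 \leq 1$ (Case 2) and $\mu_D = 1.1 > 1$, so the returned threshold lies strictly between 0 and $\beta/\mu_D$.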

Our goal is to find the choice of $r$ with minimum product of degrees $\log(\Lambda_{G_T})/(T/2)$ that survives. We define $r_1$ as follows:

$$r_1 = \arg\min_{r \in \mathcal{R}_{\alpha,\beta}(\epsilon)} (1 - r)\log(f_1 - 1) + r\log(f_2 - 1).$$

Since $\mathcal{R}_{\alpha,\beta}(\epsilon)$ is just an interval and we are minimizing a linear function with a positive slope, the optimal solution is $r_1 = \inf_{r \in \mathcal{R}_{\alpha,\beta}(\epsilon)} r$. This is a choice that survives with high probability and has the minimum product of degrees. Precisely, with probability at least $1 - e^{-C_{D'}\log(t_0)}$, where $C_{D'}$ depends on $D'$ and $\epsilon$, we have that

$$\frac{\log(\Lambda_{G_T})}{T/2} \leq \langle r_1, f \rangle + \frac{c\log(t_0)}{t_0}\log(f_2 - 1),$$

where with a slight abuse of notation, we define $\langle r_1, f \rangle = (1 - r_1)\log(f_1 - 1) + r_1\log(f_2 - 1)$. It follows that

$$\frac{\log(\Lambda_{G_T})}{T/2} - \langle r^*, f \rangle \leq (r_1 - r^*)\log\left(\frac{f_2 - 1}{f_1 - 1}\right) + \frac{c\log(t_0)}{t_0}\log(f_2 - 1). \tag{67}$$
By setting $\epsilon$ small enough and $t_0$ large enough, we can make this as small as we want. For any given $\delta > 0$, there exists a positive $\epsilon > 0$ such that the first term is bounded by $\delta/2$. Further, recall that $T/2 = c\log(t_0) + t_0$. For any choice of $\epsilon$, there exists a $t_{D',\epsilon}$ such that for all $T \geq t_{D',\epsilon}$ the vanishing term in Eq. (65) is smaller than $\epsilon$. For any given $\delta > 0$, there exists a positive $t_{D',\delta}$ such that $T \geq t_{D',\delta}$ implies that the second term is upper bounded by $\delta/2$. Putting everything together (and setting $\epsilon$ small enough for the target $\delta$), we get that

$$P\left( \frac{\log(\Lambda_{G_T})}{T/2} \geq \langle r^*, f \rangle + \delta \right) \leq e^{-C_{D',\delta}\log(t_0)} \tag{68}$$

for all $T \geq C'_{D',\delta}$, where $C_{D',\delta}$ and $C'_{D',\delta}$ are positive constants that only depend on the degree distribution $D'$ and the choice of $\delta > 0$.

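For the lower bound, we define the following set of $r$ such that the exponent in Eq. (65) is strictly negative:

$$\mathcal{R}'_{\alpha,\beta}(\epsilon) = \left\{ r \mid \log(\mu_D) \leq D_{KL}(r \,\|\, \beta/\mu_D) - \epsilon \right\}. \tag{69}$$

Choosing $r \in \mathcal{R}'_{\alpha,\beta}(\epsilon)$ causes extinction with probability approaching 1. Explicitly, $P(Z^{(t)} \neq 0)$ is the probability of non-extinction at time $t$, and $P(Z^{(t)} \neq 0) \leq E[Z^{(t)}]$. By Equation (65), we have

$$E[Z^{(t)}] \leq e^{t\left(\log(\mu_D) - D_{KL}(r \| \beta/\mu_D)\right) + o(t)},$$

where $\log(\mu_D) - D_{KL}(r \,\|\, \beta/\mu_D) \leq -\epsilon$. The probability of extinction is therefore at least $1 - E[Z^{(t)}] \geq 1 - e^{-\epsilon t + o(t)}$. So defining

$$r_2 = \arg\max_{r \in \mathcal{R}'_{\alpha,\beta}(\epsilon)} (1 - r)\log(f_1 - 1) + r\log(f_2 - 1),$$

we have

$$\frac{\log(\Lambda_{G_T})}{T/2} \geq \langle r_2, f \rangle + \frac{c\log(t_0)}{t_0}\log(f_1 - 1)$$

with probability at least $1 - e^{-C_{D',2}\, t_0}$, where $C_{D',2}$ is again a constant that depends on $D'$ and $\epsilon$. It again follows that

$$\frac{\log(\Lambda_{G_T})}{T/2} - \langle r^*, f \rangle \geq (r_2 - r^*)\log\left(\frac{f_2 - 1}{f_1 - 1}\right) + \frac{c\log(t_0)}{t_0}\log(f_1 - 1), \tag{70}$$

where $r_2 - r^*$ is strictly negative. Again, for any given $\delta > 0$, there exists a positive $\epsilon > 0$ such that the first term is lower bounded by $-\delta/2$, and for any choice of $\epsilon$, there exists a $t_{D',\epsilon}$ such that for all $T \geq t_{D',\epsilon}$ the vanishing term in Eq. (65) is smaller than $\epsilon$. Note that this $\epsilon$ might be different from the one used to show the upper bound. We ultimately choose the smaller of the two $\epsilon$ values. For any given $\delta > 0$, there exists a positive $t_{D',\delta}$ such that $T \geq t_{D',\delta}$ implies that the second term is lower bounded by $-\delta/2$. Putting everything together (and setting $\epsilon$ small enough for the target $\delta$), we get that

$$P\left( \frac{\log(\Lambda_{G_T})}{T/2} \leq \langle r^*, f \rangle - \delta \right) \leq e^{-C_{D',\delta}\log(t_0)} \tag{71}$$

for all $T \geq C'_{D',\delta}$, where $C_{D',\delta}$ and $C'_{D',\delta}$ are positive constants that only depend on the degree distribution $D'$ and the choice of $\delta > 0$. This gives the desired result.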


We now address the general case for $D$ with support greater than two. We follow the identical structure of the argument. The first major difference is that node types are no longer scalar, but tuples. Each node $v$'s type $\xi_v$ is the $(\eta - 1)$-tuple listing how many nodes in the path $\phi(w, v) \setminus \{v\}$ had each non-minimum degree from $f_2$ to $f_\eta$, where $w$ is the closest node to $v$ such that $\delta_H(w, v_T) \leq c\log(t_0)$. Consequently, the threshold $r = [r_1, \ldots, r_{\eta-1}]$ is no longer a scalar, but a vector-valued, pointwise threshold on each element of $\xi_v$. We let $k = [k_1 = r_1 t_0, \ldots, k_{\eta-1} = r_{\eta-1} t_0]$ denote the time-dependent threshold, and we say $\xi_v < k$ if $(\xi_v)_i < k_i$ for $1 \leq i \leq \eta - 1$. The matrix $M$ is no longer second-order, but a tensor. Equation (62) still holds, except $M$ is replaced with its tensor representation. For brevity, let $\alpha = p_1(f_1 - 1)$ and $\beta_i = p_{i+1}(f_{i+1} - 1)$. Let $\mu_D = \alpha + \sum_i \beta_i$. Hence, Equation (63) gets modified as

$$E[Z^{(t)}] = \sum_{i_1 = 0}^{k_1} \cdots \sum_{i_{\eta-1} = 0}^{k_{\eta-1}} \binom{t}{i_1, \ldots, i_{\eta-1}} \alpha^{t - \sum_j i_j}\, \beta_1^{i_1} \cdots \beta_{\eta-1}^{i_{\eta-1}}. \tag{72}$$

Now we consider a multinomial variable $W$ with parameters $\beta_i/\mu_D$ for $1 \leq i \leq \eta - 1$ and $t$ trials. Note that $\alpha/\mu_D$ is the 'failure' probability (corresponding to a node of degree $f_1$); such events do not contribute to the category count, so the sum of parameters is strictly less than 1. As before, equation (72) can equivalently be written as

$$E[Z^{(t)}] = \mu_D^t\, P(W \leq k) = \mu_D^t \exp\left\{ -t\, D_{KL}\left(r \,\Big\|\, \frac{\beta}{\mu_D}\right) + o(t) \right\}, \tag{73}$$
where $\beta/\mu_D$ denotes elementwise division. Once again, we wish to obtain bounds on $P(W \leq k)$. As before, we define the following set of $r$ such that the exponent in Eq. (73) is strictly positive, for some $\epsilon > 0$:

$$\mathcal{R}_{\alpha,\beta}(\epsilon) = \left\{ r \mid \log(\mu_D) \geq D_{KL}\left(r \,\Big\|\, \frac{\beta}{\mu_D}\right) + \epsilon \right\}. \tag{74}$$

We now choose a threshold $r \in \mathcal{R}_{\alpha,\beta}(\epsilon)$. Using the same argument as before, we can construct a process where the probability of extinction is asymptotically zero. We again do not prune any leaves in the first $c\log(t_0)$ generations. This ensures that with high probability, there are at least $e^{C_\ell \log(t_0)}$ independent multi-type Galton-Watson processes evolving concurrently after time $c\log(t_0)$, each with probability of extinction $\theta_{D'}$. Hence with probability at least $1 - e^{-C_{D'}\log(t_0)}$ (for an appropriate choice of a constant $C_{D'}$ that only depends on the degree distribution $D'$ and the choice of $r$), the overall process does not go extinct.

We define $r_1$ analogously to the $\eta = 2$ case:

$$r_1 = \arg\min_{r \in \mathcal{R}_{\alpha,\beta}(\epsilon)} \langle r, f \rangle,$$

where we now define $\langle r, f \rangle = \left(1 - \sum_{j=1}^{\eta-1} r_j\right)\log(f_1 - 1) + \sum_{j=1}^{\eta-1} r_j \log(f_{j+1} - 1)$. Therefore with probability at least $1 - e^{-C_{D'}\log(t_0)}$, where $C_{D'}$ depends on $D'$ and $\epsilon$, we have that

$$\frac{\log(\Lambda_{G_T})}{T/2} \leq \langle r_1, f \rangle + \frac{c\log(t_0)}{t_0}\log(f_\eta - 1).$$

It follows that

$$\frac{\log(\Lambda_{G_T})}{T/2} - \langle r^*, f \rangle \leq \sum_{j=1}^{\eta-1} \left((r_1)_j - r^*_j\right)\log\left(\frac{f_{j+1} - 1}{f_1 - 1}\right) + \frac{c\log(t_0)}{t_0}\log(f_\eta - 1). \tag{75}$$

By setting $\epsilon$ small enough and $t_0$ large enough, we can make this as small as we want. For any given $\delta > 0$, there exists a positive $\epsilon > 0$ such that each term in the summation in (75) is bounded by $\delta/\eta$. Further, recall that $T/2 = c\log(t_0) + t_0$. For any choice of $\epsilon$, there exists a $t_{D',\epsilon}$ such that for all $T \geq t_{D',\epsilon}$ the vanishing term in Eq. (65) is smaller than $\epsilon$. For any given $\delta > 0$, there exists a positive $t_{D',\delta}$ such that $T \geq t_{D',\delta}$ implies that the second term of (75) is upper bounded by $\delta/\eta$. Putting everything together (and setting $\epsilon$ small enough for the target $\delta$), we get that

$$P\left( \frac{\log(\Lambda_{G_T})}{T/2} \geq \langle r^*, f \rangle + \delta \right) \leq e^{-C_{D',\delta}\log(t_0)} \tag{76}$$

for all $T \geq C'_{D',\delta}$, where $C_{D',\delta}$ and $C'_{D',\delta}$ are positive constants that only depend on the degree distribution $D'$ and the choice of $\delta > 0$.

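For the lower bound, we again define a set of $r$ such that the exponent in Eq. (65) is strictly negative:

$$\mathcal{R}'_{\alpha,\beta}(\epsilon) = \left\{ r \mid \log(\mu_D) \leq D_{KL}\left(r \,\Big\|\, \frac{\beta}{\mu_D}\right) - \epsilon \right\}. \tag{77}$$

Choosing $r \in \mathcal{R}'_{\alpha,\beta}(\epsilon)$ causes extinction with probability approaching 1. Explicitly, $P(Z^{(t)} \neq 0)$ is the probability of non-extinction at time $t$, and $P(Z^{(t)} \neq 0) \leq E[Z^{(t)}]$. By Equation (65), we have

$$E[Z^{(t)}] \leq e^{t\left(\log(\mu_D) - D_{KL}(r \| \beta/\mu_D)\right) + o(t)},$$

where $\log(\mu_D) - D_{KL}(r \,\|\, \beta/\mu_D) \leq -\epsilon$. The probability of extinction is therefore at least $1 - E[Z^{(t)}] \geq 1 - e^{-\epsilon t + o(t)}$. So defining

$$r_2 = \arg\max_{r \in \mathcal{R}'_{\alpha,\beta}(\epsilon)} \langle r, f \rangle,$$

we have

$$\frac{\log(\Lambda_{G_T})}{T/2} \geq \langle r_2, f \rangle + \frac{c\log(t_0)}{t_0}\log(f_\eta - 1)$$

with probability at least $1 - e^{-C_{D',2}\, t_0}$, where $C_{D',2}$ is again a constant that depends on $D'$ and $\epsilon$. It follows that

$$\frac{\log(\Lambda_{G_T})}{T/2} - \langle r^*, f \rangle \geq \sum_{j=1}^{\eta-1} \left((r_2)_j - r^*_j\right)\log\left(\frac{f_{j+1} - 1}{f_1 - 1}\right) + \frac{c\log(t_0)}{t_0}\log(f_\eta - 1), \tag{78}$$

where $(r_2)_j - r^*_j$ is strictly negative. Again, for any given $\delta > 0$, there exists a positive $\epsilon > 0$ such that each term in the summation in (78) is lower bounded by $-\delta/\eta$, and for any choice of $\epsilon$, there exists a $t_{D',\epsilon}$ such that for all $T \geq t_{D',\epsilon}$ the vanishing term in Eq. (65) is smaller than $\epsilon$. We again choose the smaller of the two $\epsilon$ values from the upper and lower bound. For any given $\delta > 0$, there exists a positive $t_{D',\delta}$ such that $T \geq t_{D',\delta}$ implies that the second term is lower bounded by $-\delta/\eta$. Putting everything together (and setting $\epsilon$ small enough for the target $\delta$), we get that

$$P\left( \frac{\log(\Lambda_{G_T})}{T/2} \leq \langle r^*, f \rangle - \delta \right) \leq e^{-C_{D',\delta}\log(t_0)} \tag{79}$$

for all $T \geq C'_{D',\delta}$, where $C_{D',\delta}$ and $C'_{D',\delta}$ are positive constants that only depend on the degree distribution $D'$ and the choice of $\delta > 0$. This gives the desired result.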

1) Proof of Lemma 8.3: If $f_1 > 2$, then the claim follows directly, because each leaf generates at least 2 children in each generation.

If $f_1 = 2$, then for parameters $\rho > 0$ and $\lambda > 0$, we use the Markov inequality to get

$$P(Z^{(t)} < \rho) \leq E\left[e^{-\lambda Z^{(t)}}\right] e^{\lambda\rho} = g^{(t)}_{D-1}(e^{-\lambda})\, e^{\lambda\rho},$$

where $g_{D-1}(s) = \sum_i p_i s^{f_i - 1}$ is the probability generating function of $D - 1$, and $g^{(t)}_{D-1}(s)$ is the $t$-fold composition of this function. The goal is to choose parameters $\rho$ and $\lambda$ such that this quantity approaches zero exponentially fast. The challenge is understanding how $g^{(t)}_{D-1}(e^{-\lambda})$ behaves for a given choice of $\lambda$.

Figure 25 illustrates $g_{D-1}(s)$. Because each node always has at least one child, the probability of extinction for this branching process is 0. As such, the probability generating function is convex, with $g_{D-1}(0) = 0$ and $g_{D-1}(1) = 1$. This implies that for any starting point $e^{-\lambda}$, the fixed-point iteration method approaches 0. We characterize the rate at which $g^{(t)}_{D-1}(s_0)$ approaches 0 by separately bounding the rate of convergence in three different regions of $s$ (Figure 25). First, we choose a starting point $s_0 = e^{-\lambda}$. We pick any value $s_1 < 1$, such that the slope is strictly larger than one, i.e. $g'_{D-1}(s_1) > 1$. There may be multiple points that satisfy this property; we can choose any one of them, since it only changes the constant factor in the exponent. Without loss of generality, we assume that $s_0 > s_1$, since otherwise we can start the analysis from the region III. Then region I consists of all $s \in [s_1, s_0]$. To define $s_2$, we draw a line segment parallel to the diagonal from $s_1$. The intersection is defined $(s_2, g_{D-1}(s_2))$. Region II consists of all $s \in [s_2, s_1)$. Finally, we choose a threshold $\epsilon$, below which we say the process has converged. Then region III consists of all $s \in [\epsilon, s_2)$. We wish to identify a time $t$ that guarantees, for a given $\epsilon$ and $\lambda$, that $g^{(t)}_{D-1}(e^{-\lambda}) \leq \epsilon$.
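The fixed-point iteration described above is easy to run directly. The sketch below applies the generating function repeatedly starting from $s_0 = e^{-\lambda}$ and reports the first $t$ at which the iterate drops to $\epsilon$ or below. It assumes $g_{D-1}(s) = \sum_i p_i s^{f_i - 1}$ with every $f_i \geq 2$, so that $g_{D-1}(0) = 0$; the function names and the example distribution are illustrative.

```python
import math


def pgf(s: float, support, probs) -> float:
    """g_{D-1}(s) = sum_i p_i * s**(f_i - 1) for degrees f_i with mass p_i."""
    return sum(p * s ** (f - 1) for f, p in zip(support, probs))


def iterations_to_converge(lam: float, eps: float, support, probs,
                           max_iter: int = 100_000) -> int:
    """Smallest t with the t-fold composition g^(t)(exp(-lam)) <= eps."""
    s = math.exp(-lam)
    for t in range(1, max_iter + 1):
        s = pgf(s, support, probs)
        if s <= eps:
            return t
    raise RuntimeError("did not reach eps within max_iter iterations")
```

For degrees $\{2, 3\}$ with equal mass, the iterates leave the neighborhood of 1 geometrically and then shrink by roughly a factor $p_1 = 0.5$ per step near 0, which corresponds to the region I and region III behavior bounded in the proof.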


Fig. 25: Regions of the probability generating function, in which we bound the rate of convergence.

To begin, we split the time spent in each region into $t_1$, $t_2$, and $t_3$ with $t_1 + t_2 + t_3 = t$. We first characterize $t_1$. Note that $g_{D-1}(s_0) \leq 1 - g'_{D-1}(s_1)(1 - s_0)$ for $s_0$ in region I. This holds because $s_1$ has the lowest slope of all points in region I. Applying this recursively, we get that

$$g^{(t_1)}_{D-1}(s) \leq \max\left\{ 1 - g'_{D-1}(s_1)^{t_1}(1 - s),\; g_{D-1}(s_1) \right\}$$

for all $s$ in region I. In region II, we instead upper bound $g_{D-1}(s)$ by the line segment joining $g_{D-1}(s_1)$ and $g_{D-1}(s_2)$. This line has slope 1, giving

$$g^{(t_2)}_{D-1}(s_1) \leq \max\left\{ g_{D-1}(s_1) - (s_1 - g_{D-1}(s_1))\, t_2,\; g_{D-1}(s_2) \right\}.$$

In region III, we upper bound $g_{D-1}(s)$ by the line $y(s) = g'_{D-1}(s_2)\, s$. We have that $g_{D-1}(s) \leq g'_{D-1}(s_2) \cdot s$ for $s$ in region III. Recursing this relation gives

$$g^{(t_3)}_{D-1}(s) \leq \max\left\{ g'_{D-1}(s_2)^{t_3} \cdot s,\; \epsilon \right\}.$$

Thus, if $t \geq 3\max\{t_1, t_2, t_3\}$, then $g^{(t)}_{D-1}(e^{-\lambda}) \leq \epsilon$. In particular, we choose

$$t \geq 3\max\left\{ \frac{\log\left((1 - g_{D-1}(s_1))/(1 - e^{-\lambda})\right)}{\log(g'_{D-1}(s_1))},\; \frac{g_{D-1}(s_1) - g_{D-1}(s_2)}{s_1 - g_{D-1}(s_1)},\; \frac{\log(\epsilon/s_2)}{\log(g'_{D-1}(s_2))} \right\}. \tag{80}$$

So for sufficiently large $t$, we have $P(Z^{(t)} < \rho) \leq \epsilon\, e^{\lambda\rho}$. By choosing $\epsilon = s_2\, g'_{D-1}(s_2)^{t/3}$, we ensure that the third bound on $t$ is always true, and the other two are constant. Similarly, we choose

e"^ = 1 - 


1 - S2 


giving 


/ 


nz^*'><p) < S2-5b_i(s2)‘/' 


\ 


-p 


1 - 


1 - S2 




V 


B 




= S2-g'D-i{s2Y'Hl-B)-^‘ 

Choosing p = pi,_i(si)‘/^/(l — S 2 ), we observe that for t larger than the bound in ( |80l l, the number of leaves is lower bounded 
by an exponentially growing quantity (p) with probability approaching 1 exponentially fast in t. 


F. Proof of Proposition 3.9

Number of nodes. $T$ is either even or odd. At each even $T$, $G_T$ is a ball (defined over a grid graph) centered at the virtual source with radius $T/2$; that is, $G_T$ consists of all nodes whose distance from the virtual source is at most $T/2$ hops. Thus at each successive even $T$, $G_T$ increases in radius by one. The perimeter of such a ball (over a two-dimensional grid) is $4(T/2) = 2T$. The total number of nodes is therefore $N_T = 1 + \sum_{h=1}^{T/2} 4h = \frac{1}{2}(T^2 + 2T + 2)$.

When $T$ is odd, there are two cases. Either the virtual source did not move, in which case $N_T = N_{T+1}$ (because all the spreading occurs in one time step), or the virtual source did move, so spreading occurs over two timesteps. In the latter case, the odd timestep adds a number of nodes that is at least half plus one of the previous timestep's perimeter nodes: $N_T \geq N_{T-1} + \frac{2(T-1)}{2} + 1 = \frac{1}{2}(T^2 + 2T + 1)$. This is the smaller of the two expressions, so we have $N_T \geq (T + 1)^2/2$.
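The node counts above can be checked by brute force. The sketch below enumerates lattice points within a given hop (L1) distance of the origin and compares the count against the closed form for even $T$, and then checks the arithmetic of the odd-$T$ lower bound. The function names are illustrative.

```python
def ball_size(radius: int) -> int:
    """Count grid points (x, y) with |x| + |y| <= radius by enumeration."""
    return sum(
        1
        for x in range(-radius, radius + 1)
        for y in range(-radius, radius + 1)
        if abs(x) + abs(y) <= radius
    )


def n_even(T: int) -> int:
    """Closed form (T^2 + 2T + 2) / 2 for even T (ball of radius T/2)."""
    return (T * T + 2 * T + 2) // 2


def n_odd_lower_bound(T: int) -> int:
    """Previous ball plus half of its perimeter plus one, for odd T."""
    perimeter_prev = 4 * ((T - 1) // 2)  # 4h nodes at distance h = (T-1)/2
    return n_even(T - 1) + perimeter_prev // 2 + 1
```

For example, `ball_size(2)` and `n_even(4)` both give 13, and `n_odd_lower_bound(3)` gives 8, which equals $(3 + 1)^2/2$.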
Probability of detection. At each even $T$, $G_T$ is symmetric about the virtual source. We reiterate that the snapshot adversary can only see which nodes are infected; it has no information about who infected whom.

In order to ensure that each node is equally likely to be the source, we want the distribution of the (strictly positive) distance from the virtual source to the true source to match exactly the distribution of nodes at each viable distance from the virtual source:

$$p^{(t)} = \frac{4}{t\left(\frac{t}{2} + 1\right)} \begin{pmatrix} 1 \\ 2 \\ \vdots \\ t/2 \end{pmatrix}. \tag{81}$$
Protocol 5 Grid adaptive diffusion

Input: grid contact network $G = (V, E)$, source $v^*$, time $T$
Output: set of infected nodes $V_T$
1: $V_0 \leftarrow \{v^*\}$, $h \leftarrow 0$, $v_0 \leftarrow v^*$

2: $\mathcal{K} \leftarrow \{N, S, E, W\}$ ▷ Cardinal directions

3: let $k_v(u)$ denote $u$'s direction with respect to $v$
4: $v^*$ selects one of its neighbors $u$ at random
5: $V_1 \leftarrow V_0 \cup \{u\}$, $v_1 \leftarrow u$
6: $h_x \leftarrow \mathbb{1}_{\{k_{v^*}(u) = E\}} - \mathbb{1}_{\{k_{v^*}(u) = W\}}$

7: $h_y \leftarrow \mathbb{1}_{\{k_{v^*}(u) = N\}} - \mathbb{1}_{\{k_{v^*}(u) = S\}}$

8: let $N_K(u)$ represent $u$'s neighbors in directions $K \subseteq \mathcal{K}$
9: Vb 3— Vi U 7V^(u) \ {u*}, V2 3— Vi 
10: f ^ 3 
11: for t < T do 

12: Vt-i selects a random variable X ^ U{0, 1) 

13: if X < a{t — 1, \h^\ + \h^\) then 

14: for all V G N{vt-i) do 

15: Infection Message(G,z;t_i,t;,{A:«(t;t_i)}, Gt) 

16: else 

17: it: 3- 0 

18: if < 0 then 

19: K^KU{E} 

20 : else if > 0 then 

21: K^KU{W} 

22: if < 0 then 

23: K^KU{N} 

24: else if > 0 then 

25: K^KU{S} 

26: Vt-i randomly selects u € 

27: + l{lc„ («)=£;}— 

28: + l{fe„(«)=Ar} - l{fc„(«)=S} 

29: Vt U 

30: for all u e do 

31: Infection Message{G,Vt,v,{kv^{vt-i), kv{vt)},Vt) 

32: if f + 1 > T then 

33: break 

34: Infection Message(G,Vt,v,{kv^{vt-i), kv{vt)},Vt) 

35: t i — f -f 2 

36: procedure Infection MESSAGE(G,M,z;,it:,yt) 

37: if u e Vt then 

38: for all w G do 

39: Infection Message(G,u,r(;,it:,Gt) 

40: else 

41: Vt ^ V't_2 U {u} 


There are $4h$ nodes at distance $h$ from the virtual source, and by symmetry all of them are equally likely to have been the source, giving:

$$P(G_T \mid v^*, \delta_H(v^*, v_T) = h) = \frac{1}{4h}\, p^{(T)}_h = \frac{1}{T\left(\frac{T}{2} + 1\right)},$$

which is independent of $h$. Thus all nodes in the graph are equally likely to have been the source. The claim is that by choosing $\alpha(t, h)$ according to Equation (31), we satisfy the distribution in (81).
The state transition can be represented as the usual $((t/2) + 1) \times (t/2)$ dimensional column stochastic matrix:

$$p^{(t+2)} = \begin{pmatrix} \alpha(t, 1) & & & \\ 1 - \alpha(t, 1) & \alpha(t, 2) & & \\ & 1 - \alpha(t, 2) & \ddots & \\ & & \ddots & \alpha(t, t/2) \\ & & & 1 - \alpha(t, t/2) \end{pmatrix} p^{(t)}.$$

This relation holds because we have imposed the condition that the virtual source never moves closer to the true source. We can solve directly for $\alpha(t, 1) = t/(t + 4)$, and obtain a recursive expression for $\alpha(t, h)$ when $h > 1$:

$$\alpha(t, h) = \frac{t}{t + 4} - \frac{h - 1}{h}\left(1 - \alpha(t, h - 1)\right). \tag{82}$$

We show by induction that this expression evaluates to Equation (31). For $h = 2$, we have $\alpha(t, 2) = \frac{t}{t+4} - \frac{1}{2} \cdot \frac{4}{t+4} = \frac{t - 2}{t + 4}$. Now suppose that Equation (31) holds for all $h < h_0$. We then have

$$\alpha(t, h_0) = \frac{t}{t + 4} - \frac{h_0 - 1}{h_0} \cdot \frac{2 h_0}{t + 4} = \frac{t - 2(h_0 - 1)}{t + 4},$$

which is the claim.
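The induction can be double-checked with exact rational arithmetic. The sketch below evaluates the recursion for $\alpha(t, h)$, compares it against the closed form $(t - 2(h-1))/(t+4)$, and confirms that one transition step maps the distribution with mass proportional to $h$ at distance $h$ (for time $t$) onto the same family at time $t + 2$. Function names are illustrative, and $t$ is assumed even.

```python
from fractions import Fraction


def alpha_closed(t: int, h: int) -> Fraction:
    """Closed form (t - 2(h-1)) / (t + 4)."""
    return Fraction(t - 2 * (h - 1), t + 4)


def alpha_recursive(t: int, h: int) -> Fraction:
    """Unroll alpha(t,h) = t/(t+4) - ((h-1)/h) * (1 - alpha(t,h-1))."""
    a = Fraction(t, t + 4)  # base case alpha(t, 1)
    for j in range(2, h + 1):
        a = Fraction(t, t + 4) - Fraction(j - 1, j) * (1 - a)
    return a


def target_distribution(t: int) -> list:
    """Mass 4h / (t (t/2 + 1)) on distances h = 1..t/2 (t even)."""
    m = t // 2
    return [Fraction(4 * h, t * (m + 1)) for h in range(1, m + 1)]


def step(t: int) -> list:
    """Apply one keep/pass transition to the time-t distribution."""
    p = target_distribution(t)
    m = t // 2
    nxt = [Fraction(0)] * (m + 1)
    for h in range(1, m + 1):
        a = alpha_closed(t, h)
        nxt[h - 1] += a * p[h - 1]      # virtual source stays at distance h
        nxt[h] += (1 - a) * p[h - 1]    # virtual source moves to distance h+1
    return nxt
```

For $t = 2$ the distribution is a point mass at distance 1, $\alpha(2, 1) = 1/3$, and one step gives $[1/3, 2/3]$, which is the time-4 target.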

By construction the ML estimator for even $T$ is to choose any node except the virtual source uniformly at random. For odd $T$, there are two options: either the virtual source stayed fixed or it moved. If the former is true, then spreading occurs in one timestep, so the ML estimator once again chooses a node other than the virtual source uniformly at random. If the virtual source moved, then $G_T$ is symmetric about the edge connecting the old virtual source to the new one. Since the adversary only knows that virtual sources cannot be the true source, the ML estimator chooses one of the remaining $N_T - 2$ nodes uniformly at random. This gives a probability of detection of $1/(N_T - 2)$. The claim follows from observing that $N_T - 2 \geq \frac{1}{2}(T + 1)^2 - 2 = \frac{(T + 3)(T - 1)}{2}$.


G. Proof of Proposition 6.1

The control packet at spy node $s_1$ includes the amount of delay at $s_1 = 0$ and all descendants of $s_1$, which is the set of nodes $\{-1, -2, \ldots\}$. The control packet at spy node $s_2$ includes the amount of delay at $s_2 = n + 1$ and all descendants of $s_2$, which is the set of nodes $\{n + 2, n + 3, \ldots\}$. Given this, it is easy to figure out the whole trajectory of the virtual source for time $t \geq T_1$. Since the virtual source follows i.i.d. Bernoulli trials with probability $q$, one can exactly figure out $q$ from the infinite Bernoulli trials. Also the direction $D$ is trivially revealed.

To lighten the notation, let us suppose that $T_1 \leq T_2$ (or equivalently $T_{s_1} \leq T_{s_2}$). Now using the difference of the observed time stamps $T_{s_2} - T_{s_1}$ and the trajectory of the virtual source between $T_{s_1}$ and $T_{s_2}$, the adversary can also figure out the time stamp $T_1$ with respect to the start of the infection. Further, once the adversary figures out $T_1$ and the location of the virtual source $v_{T_1}$, the timestamp $T_2$ does not provide any more information. Hence, the adversary performs ML estimate using $T_1$, $D$ and $q$. Let $B(k, n, q) = \binom{n}{k} q^k (1 - q)^{n-k}$ denote the pmf of the binomial distribution. Then, the likelihood can be computed for $T_1$ as


,(adaptive) 
Ti|y*,Q,D 


h\v*,q,r) = 


gB(^*_L_2,L_2,g)I 


2 y;-(„.g[2+^,ti]) 

Biv* - 


, if ti even , 
, if ti odd , 



q)BC-^-^ 


ti-3 


,9)1 




, if ti even , 
, if ti odd . 


(83) 


(84) 


This follows from the construction of the adaptive diffusion. The protocol follows a binomial distribution with parameter $q$ until $(T_1 - 1)$. At time $T_1$, one of the following can happen: the virtual source can only be passed (the first equation in (83)), it can only stay (the second equation in (84)), or both cases are possible (the second equation in (83)).

Given $T_1$, $Q$ and $D$, which are revealed under the adversarial model we consider, the above formula implies that the posterior distribution of the source also follows a binomial distribution. Hence, the ML estimate is the mode of a binomial distribution with a shift; for example, when $t_1$ is even, the ML estimate is the mode of $2 + (t_1/2) + Z$ where $Z \sim$ Binom(($t_1$/2)
adversary can compute the ML estimate: 


r’ML = 


= v*\v*,q) = 

1 

2 




Ti+2 

2 

+ [71 

( Ti-2^ 

i 2 J 

IJ 

if Ti 

even, D 

= i. 

Ti+3 

2 

+ [71 

( Ti-l 1 
1 2 J 

IJ 

if Ti 

odd, D = 

--e, 

+ L(i 

-7)1 

( Ti-lS 
2 J 

IJ 

if Ti 

odd, D = 

= r . 

1 s. 

this gives 





4-1 

2 

* 

-V , 

ti — 3 
2 


1 ^(-Smi. 

—V*) ^(ti is odd) 


2,q). The 


(85) 


( 86 ) 


P^^«(fi,r,F*=t)ML|g) = 

1 , X /fi - 1 fi - 3 \ 

= -t^ML, —II(ti isodd) (87) 

< {l-q) / V2I^t, is odd and ip > 3) 

where uml = VMhiti,q,r) is provided in ( |85] ), and the bound on B{-) follows from Gaussian approximation (which gives an 
upper bound 1/^^2nkq{l — q)) and Berry-Esseen theorem (which gives an approximation guarantee of 2 x 0.4748/- q)) 
lISTl . for k = {ti — 3)/2. Marginalizing out Ti € {3,5,... ,2[(n — 1 )/ 2 J + 1 } and applying an upper bound 1/v^ ^ 
2^/kTl - 2 < 2^/k^ + y/l/(2(k-l)) - 2 < y/4(k - 1), we get 


I(ti=3), 


( 88 ) 


Similarly, we can show that 


P(^l) = r,V* = vml, Ti is oddjQ = q) < 

(1 - q)V2 /g n - 1 I ^ 1 - q 
2n y/q{l — g) V L 2 J 2n 


V{D = i,V* =vml,Ti is odd|Q = g) < 
V2 I n - 1 I 1 

2 ny^q{l — (?) V - 2 J n’ 


(89) 


(90) 


Summing up. 


P(y* = vml,Ti is even|Q = q) < 
/g /I I 1 + g 

2n \/q{l — (?) V - 2 ^ 2 n 


p(y* 


VML\Q = q) < 


iq{l-q) 


+ 


2 

n 


Recall Q is uniformly drawn from [0,1]. Taking expectation over Q gives 


where we 


used /o I/a/^ 


x)dx 


p(y* 


^ml) 



arcsin(l) — arcsin(— 1 ) = tt. 


(91) 


(92) 


(93) 